Security frameworks make conversations possible. "Did you address LLM06?" is a productive question. "Did you address sensitive information disclosure?" is a fuzzier one that takes 30 minutes longer. So when teams adopting AI agents ask us how Clampd's detection coverage maps to OWASP, we'd rather give them a verifiable answer than a brochure.
The numbers below come from a one-line grep across our rule TOML files. Every rule in services/crates/ag-engine/src/builtins/rules/*.toml can carry a taxonomy.owasp_llm tag. We aggregated all 13 TOML files and counted.
## The mapping
Counts are unique rules tagged with each category. Some rules cover more than one (e.g. a "leak secrets in LLM output" rule maps to both LLM06 and LLM02), so the column sums to more than the 236 uniquely tagged rules.
| OWASP | Category (v2023) | Rules tagged | What we catch |
|---|---|---|---|
| LLM01 | Prompt Injection | 36 | Direct injection (ignore-previous-instructions and 19 multilingual variants), indirect injection in tool params and tool responses, system-prompt-extraction attempts, role-confusion, soft injection phrases, jailbreak DAN-class. |
| LLM02 | Insecure Output Handling | 15 | PII or secrets reaching the LLM output, structured data leaks, schema-injection in returned tool descriptions, output-side scanner (configurable in SDK via scan_output). |
| LLM03 | Training Data Poisoning | 1 | Limited coverage and worth flagging. Clampd is a runtime control plane; training-data hygiene is upstream. The single rule we tag here flags context poisoning (flag-context-poisoning) as a runtime indicator that a poisoned input has reached the agent. |
| LLM04 | Model Denial of Service | 2 | Per-session tool-call budget and token-limit checks. The bulk of DoS protection is rate limiting at the gateway layer (license_ratelimit, session_toolauth), not rules. |
| LLM05 | Supply Chain Vulnerabilities | 13 | Tool descriptor mutation (rug-pull) detection via SHA-256 contract hash, MCP server impersonation, agent-to-agent schema weakening, dependency-confusion patterns in tool params. |
| LLM06 | Sensitive Information Disclosure | 47 | PII patterns (email, SSN, credit card, phone, MRN, DOB, IBAN, UK NIN, French INSEE/NIR), secrets (.env, .ssh, AWS keys, vault tokens, npmrc, pgpass), business-confidential markers, region-specific identifiers (Aadhaar, PAN, NRIC). |
| LLM07 | Insecure Plugin Design | 77 | Largest category by rule count, because Clampd is a tool/plugin firewall. Destructive SQL, command injection, SSRF (incl. cloud metadata), path traversal, reverse shells, persistence mechanisms, schema injection in tool definitions. |
| LLM08 | Excessive Agency | 58 | Scope-violation detection, cross-agent privilege escalation, unauthorized delegation, tool authorization checks (Stage 4.5 in the gateway), boundary-breach detection (writing outside approved categories). |
| LLM09 | Overreliance | 3 | Limited. We flag agent decisions that ignored a model-flagged warning and proceeded anyway. Most overreliance defence is application-level UI work, not runtime rules. |
| LLM10 | Model Theft | 0 | Honest gap. Model-theft is a content/network-egress problem more than a tool-call-firewall problem. Our session layer's slow-drip and volume-anomaly patterns catch the symptoms but no rule explicitly tags LLM10. We'd build category-specific coverage if customer signal pulls us there. |
Total unique rules carrying at least one OWASP LLM tag: 236 of 263. The remaining 27 are operational rules (delegation chain integrity, mandate validation, infrastructure-specific patterns) that don't map cleanly to OWASP's categories but are still part of the live ruleset.
## Honest reading of this table
Tagging 47 rules with LLM06 doesn't mean we will catch every sensitive-information disclosure. Specific PII formats, novel encodings, indirect leakage chains, side-channel exfiltration: each requires its own work. The number tells you how much rule mass is pointed at a category. It doesn't tell you the false-negative rate. We separately maintain a 556-payload regression corpus to measure that, which we'll write about in a follow-up.
## The v2023 to v2025 reshuffle
OWASP shipped a v2025 of the LLM Top 10 that renumbered and reorganised the categories. If your team is using v2025, here's the rough mapping back to our v2023 tags.
| v2025 | v2025 Title | v2023 Equivalent in Our Tags | Coverage |
|---|---|---|---|
| LLM01 | Prompt Injection | LLM01 | 36 rules |
| LLM02 | Sensitive Information Disclosure | LLM06 | 47 rules |
| LLM03 | Supply Chain | LLM05 | 13 rules |
| LLM04 | Data and Model Poisoning | LLM03 | 1 rule |
| LLM05 | Improper Output Handling | LLM02 | 15 rules |
| LLM06 | Excessive Agency | LLM08 | 58 rules |
| LLM07 | System Prompt Leakage (NEW) | part of LLM06 + LLM02 | Multiple rules cover system-prompt extraction; not a distinct tag yet. |
| LLM08 | Vector and Embedding Weaknesses (NEW) | not tagged | 0 rules |
| LLM09 | Misinformation | LLM09 | 3 rules |
| LLM10 | Unbounded Consumption | part of LLM04 + LLM10 | 2 rules |
We're working through the v2025 retag in the rule TOMLs. The data above maps from v2023 because that's what's currently tagged in code; we'll publish an updated table when v2025 is canonical.
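In the meantime, teams reporting against v2025 can translate tags mechanically. Here's a throwaway sketch derived from the table above; the partial mappings (the v2023 LLM04/LLM10 pair feeding v2025 LLM10, and the new v2025 LLM07 drawing on parts of v2023 LLM06 and LLM02) are flattened for illustration, so treat it as an approximation rather than a canonical crosswalk:

```python
# Approximate v2023 -> v2025 renumbering, per the mapping table above.
# v2023 LLM07 (Insecure Plugin Design) has no single v2025 equivalent
# in that table, so it is deliberately left out here.
V2023_TO_V2025 = {
    "LLM01": "LLM01",  # Prompt Injection (number unchanged)
    "LLM02": "LLM05",  # Insecure Output Handling -> Improper Output Handling
    "LLM03": "LLM04",  # Training Data Poisoning -> Data and Model Poisoning
    "LLM04": "LLM10",  # Model DoS -> Unbounded Consumption (partial)
    "LLM05": "LLM03",  # Supply Chain
    "LLM06": "LLM02",  # Sensitive Information Disclosure
    "LLM08": "LLM06",  # Excessive Agency
    "LLM09": "LLM09",  # Overreliance -> Misinformation (rough equivalent)
    "LLM10": "LLM10",  # Model Theft folded into Unbounded Consumption (partial)
}

def retag(v2023_tags):
    """Map a rule's v2023 tags to v2025 labels, dropping unmapped tags."""
    return sorted({V2023_TO_V2025[t] for t in v2023_tags if t in V2023_TO_V2025})

print(retag(["LLM06", "LLM02"]))  # ['LLM02', 'LLM05']
```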
## How to use this in a vendor questionnaire
If a security auditor or customer asks "what's your OWASP LLM coverage?", the honest answer is per-category and includes both the numerator (rule count in each category) and the known gaps. Here's a one-paragraph version you can paste:
Clampd ships 263 runtime detection rules tagged against OWASP LLM Top 10 (v2023): LLM01 36 rules, LLM02 15, LLM03 1, LLM04 2, LLM05 13, LLM06 47, LLM07 77 (largest, covers tool-call firewall scope), LLM08 58, LLM09 3, LLM10 0. Total 236/263 rules carry at least one OWASP tag. v2025 retag in progress; LLM08 (vector/embedding weaknesses) is currently 0 because we are a runtime tool-call firewall and don't sit on the embedding pipeline.
## Beyond OWASP: regulation tags
Each rule also carries a regulations tag with mappings to HIPAA, GDPR, SOC2, PCI-DSS, and CCPA. The TOML for our destructive-SQL rule, for example, looks like:
```toml
[[rule]]
id = "R001"
name = "block-destructive-sql"
risk_score = 0.95
action = "block"
...

[rule.taxonomy]
atlas = ["AML.T0051"]
owasp_llm = ["LLM07"]
regulations = ["PCI-DSS", "SOC2"]
```
Same idea: a rule can be queried by OWASP category, by MITRE ATLAS technique ID, or by regulation. The dashboard's compliance report templates (HIPAA, GDPR, SOC2, PCI-DSS) read those tags to generate evidence views. The CCPA tag exists on rules but doesn't have a dedicated report template yet.
## What this means for buying decisions
If your team's evaluation criterion is "does this product cover the OWASP LLM Top 10", almost any product on our compare page can answer "partially, yes", because OWASP is broad. The interesting questions are one level down:
- Where on the call path does each category get caught? Prompt-only safety filters catch LLM01 at the prompt boundary. Tool-call firewalls catch LLM01 again at the tool boundary, plus LLM07 (insecure plugins) which prompt-only tools can't see at all.
- How many rules per category, and how often do they update? A vendor that says "LLM06 supported" with one rule is technically correct but practically thin. Ask for the per-category rule count.
- What's the honest gap? The vendor that volunteers "we have zero coverage on LLM10 because that's not where we play" is the vendor whose other claims you can probably trust.
## See the rule corpus yourself
The 263 rules live in plain TOML at services/crates/ag-engine/src/builtins/rules/. Each carries OWASP, MITRE ATLAS, and regulation tags. Dashboard compliance reports query against them.
Get Started → Compare to alternatives