I keep meeting teams that ship multi-agent systems on top of LangChain or CrewAI, get to production, and then realise nobody has a good answer to "what if the model picks the wrong tool" or "what if one of the agents gets compromised." The honest answer is usually a Slack channel and a prayer. Sometimes a regex on outbound traffic, which catches roughly nothing.
This post is the version of that conversation I now have on repeat. There is one decorator. There is no agent loop rewrite. You keep your LLM provider. You keep your framework. You add three lines at boot and one annotation per tool, and from that point forward every tool call goes through a 9-stage pipeline that catches the categories below.
What you actually get
Clampd sits between your agent and its tools as a self-hosted gateway. Every @clampd.guard-wrapped function call detours through it. Each stage targets a specific attack class, not a marketing line.
- Prompt injection blocking. 263 built-in rules in ag-engine/src/builtins/rules/*.toml, with named IDs you can grep for. R013 catches instruction overrides ("ignore previous instructions"). R014 catches roleplay jailbreaks (DAN, "you are now"). R015 catches delimiter abuse (<|im_start|>, [INST]). R031 catches indirect injection via context. R038 catches system-prompt extraction.
- Rug-pull detection. Every tool descriptor (name + description + parameter schema) gets a content-addressable SHA-256 at registration time. The hash travels with every call. If a vendor mutates the schema between deploys, the hash stops matching and the gateway returns descriptor_hash_mismatch. We wrote about why this matters.
- Delegation chain governance. Maximum depth 5. Cycle detection. Per-edge tool restrictions (the parent can say "you may only call search.web, never db.write"). When you lock the workflow, calls in the unapproved direction are denied. When you revoke an edge, the deny propagates in about 20 seconds.
- Kill cascade. ag-kill/src/cascade.rs walks the delegation tree. Kill one agent, every descendant gets killed too, with a reason chain that names the root. Agents the killed one recently touched (but not downstream of it) get enhanced monitoring for 5 minutes rather than a hard kill, which is the right call when the contact might have been incidental.
- Behavioral EMA. ag-risk keeps an exponentially-weighted moving average of risk per agent. Alpha is 0.3. A score of 0.7 pages on-call. A score of 0.9 auto-suspends through ag-kill. Idle agents decay back toward zero. We arrived at those numbers by running the redteam corpus and watching the false-positive rate.
- Response inspection. check_response=True on the guard, or an explicit client.scan_output call, scans tool return values for PII and secrets before that data re-enters the LLM context. The hit list comes back as actual matches, not just a boolean.
- Scope tokens. ag-token mints a per-call signed JWT. Downstream tools can verify it with clampd.verify_scope_token. Each token is single-use. Replays get rejected.
- Rate limiting and circuit breakers. Per-agent counters in Redis. Client-side circuit breaker in the SDK (cb_threshold, cb_reset_timeout_ms) so a flaky gateway degrades gracefully instead of taking down your agent.
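As a concrete sketch of the EMA layer: the update rule below uses only the constants stated above (alpha 0.3, page on-call at 0.7, auto-suspend at 0.9). The function name and the classification strings are illustrative, not Clampd's actual code.

```python
# Illustrative EMA risk update in the spirit of ag-risk. Constants are from
# the post; names and return shape are hypothetical.
ALPHA = 0.3
PAGE_THRESHOLD = 0.7
SUSPEND_THRESHOLD = 0.9

def update_risk(ema: float, event_risk: float) -> tuple[float, str]:
    """Fold one event's risk score into the running EMA and classify it."""
    ema = ALPHA * event_risk + (1 - ALPHA) * ema
    if ema >= SUSPEND_THRESHOLD:
        return ema, "auto_suspend"   # ag-kill takes over
    if ema >= PAGE_THRESHOLD:
        return ema, "page_oncall"
    return ema, "ok"

# A burst of high-risk events walks the score up quickly; between events,
# idle decay (not modelled here) pulls it back toward zero.
score, action = 0.0, "ok"
for risk in [0.9, 0.9, 0.9, 0.9, 0.9]:
    score, action = update_risk(score, risk)
```

Note the smoothing: a single bad event cannot page anyone, but a sustained run of them does.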
What the live feed actually looks like
Talking about layers is one thing. Here is what the Risk Feed in the dashboard showed on our dev cluster while we were writing this post. Real agent IDs, real rule hits, real risk scores out of ag-risk. Nothing rehearsed.
Risk Feed · last 8 events · LIVE

A few notes on those rows, because they each correspond to a different layer firing.
- The two delegation_not_approved blocks at the top are the workflow lock doing its job. The orchestrator-to-research edge was approved. The reverse was not. The reverse call gets denied at ag-policy with a directional Redis key shape (ag:delegation:approved:{caller}:{target}).
- The task_replay_detected hit at 0.90 is what ag-risk's replay detector throws when the same (caller, target, tool, params, trace) tuple is seen twice inside 60 seconds. The score is high enough to put the agent within striking distance of auto-suspend.
- The three PRE_REGISTRY blocks all came from a single redteam pass that tried to register an agent with a confused identifier: 🤖, a zero-width space hiding mid-string, and a Cyrillic а that looks identical to ASCII a. All denied before the agent record was created.
- The descriptor_hash_mismatch is the rug-pull catch. Between registering web.search at boot and calling it later, the descriptor was mutated. The hashes did not match. The call never went out.
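The replay check is simple to sketch. This is an illustrative stand-in for ag-risk's detector, not its real code: the tuple shape and the 60-second window come from the note above; the storage and function name are made up.

```python
import time

# Hypothetical replay detector: flag the second sighting of the same
# (caller, target, tool, params, trace) tuple inside a 60-second window.
WINDOW_S = 60
_seen: dict[tuple, float] = {}

def is_replay(caller, target, tool, params, trace, now=None) -> bool:
    now = time.monotonic() if now is None else now
    # Params are normalised so dict ordering can't dodge the check.
    key = (caller, target, tool, tuple(sorted(params.items())), trace)
    last = _seen.get(key)
    _seen[key] = now
    return last is not None and (now - last) < WINDOW_S
```

A production version would expire old entries (a Redis key with a TTL does both jobs at once), but the matching logic is the whole trick.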
Lock the delegation graph from the dashboard
Decorating tools gets you the data. Approving the graph is what turns it into enforcement. The dashboard has an A2A page that mirrors what's happening in agent_relationships, and you operate on it like this.
Three buttons, three states, three things they do.
- Observed. The default when ag-control auto-discovers a new edge. It records that orchestrator called research, keeps an observation count, and never blocks. We saw test-orchestrator → test-research hit 2530 observations before we touched anything. test-research → test-writer got to 221. The auto-discovery loop runs every 5 minutes (ag_control::local_sync: Workflow auto-discovery completed) and groups related edges into a workflow record.
- Approved. One click on "Approve edges" promotes every observed edge in this workflow. The status flips from observed to approved in agent_relationships, an approved_at timestamp gets written, and ag-control's delegation-sync loop pushes a directional Redis key (ag:delegation:approved:{caller}:{target}) every 15 seconds.
- Locked. Hit "Lock workflow" and the workflow's enforcement_mode flag flips to true. The next sync tick writes ag:delegation:enforcement:{org_id} to Redis. From that moment on, ag-policy reads the flag on every call. If the caller and target don't appear in an approved edge, the request is denied with delegation_not_approved. Direction matters: an approval of A→B doesn't authorise B→A. That's enforced at the key-shape level.
- Unlinked. "Unlink" deletes the edge. The deny propagates within roughly 20 seconds: 15 seconds for the sync tick plus another 5 for the SDK's L1 cache to expire. Any in-flight calls finish, but the next attempt is denied with the same reason.
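Because the key shape encodes direction, the enforcement check reduces to a single lookup. A minimal stand-in (an in-memory set instead of Redis; the function names are ours, the key format is from the post):

```python
# The directional key shape means A→B approval says nothing about B→A.
def approval_key(caller: str, target: str) -> str:
    return f"ag:delegation:approved:{caller}:{target}"

# Stand-in for Redis: only the forward edge was approved.
approved = {approval_key("orchestrator", "research")}

def delegation_allowed(caller: str, target: str) -> bool:
    return approval_key(caller, target) in approved
```

This is why the reverse call in the round-trip below is denied without any extra logic: the reversed key simply does not exist.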
We exercised the round-trip on dev. Three SDK calls, no manual surgery on Redis or Postgres at any point.
- Forward call (approved direction). Orchestrator delegated to research, research called data.analyze, gateway returned SUCCESS. Row 5 in the feed above, the {q:"enforce-test"} allow.
- Reverse call (no approval for B→A). Research tried to delegate back to orchestrator. Denied with delegation_not_approved. That is the 05:10:21 block.
- Revoke A→B and retry. Clicked Unlink in the dashboard, waited 22 seconds for sync plus cache, ran the forward call again. Denied with delegation_not_approved (the 05:10:45 block). Restored the edge afterwards. Total round-trip about 40 seconds wall-clock.
What the graph looks like on the dashboard
The workflow page renders agent_relationships as a node-and-edge graph. Edge counts are the actual observation_count column. The version below is a static reproduction of what we saw at the time of writing.
The graph is identical in spirit to the in-product page - same node shapes, same colour code, same metadata in the corners. Auto-discovery runs in ag-control::local_sync on a 5-minute tick. Approval and lock both happen in the same panel. You don't have to write a single SQL statement or touch Redis to move an edge between states.
Cryptographic enforcement: signed delegation proofs
Locking the graph governs which delegations are allowed. Signing the proof governs that the delegation is real. With CLAMPD_DELEGATION_SIGNATURES=on, every multi-hop call carries an HS256-signed JWT minted by the leaf agent with its own credential hash. The gateway looks up the same hash in ag:agent:cred:{leaf} and verifies four things on every hop: the signature, the audience, the expiry (30 second TTL), and a SHA-256 of the delegation chain itself so a proof minted for chain [A, B, C] can't be replayed under [A, B, X].
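The chain-hash property is easy to sketch. The exact serialization Clampd uses isn't specified here, so the delimiter below is an assumption; the point being demonstrated is the one above: any canonical hash of the ordered chain makes a proof minted for [A, B, C] useless under [A, B, X].

```python
import hashlib

def chain_hash(chain: list[str]) -> str:
    # ASSUMPTION: the real serialization isn't shown in this post. Any
    # unambiguous encoding of the ordered chain works; we join with an
    # ASCII unit separator so agent IDs can't collide across boundaries.
    return hashlib.sha256("\x1f".join(chain).encode()).hexdigest()

proof = chain_hash(["A", "B", "C"])        # hash carried inside the proof
assert proof == chain_hash(["A", "B", "C"])  # same chain verifies
assert proof != chain_hash(["A", "B", "X"])  # grafted chain does not
```

Binding the hash into the signed JWT is what turns this from a checksum into an anti-replay measure: the verifier recomputes it from the chain it actually observed, not the one the caller claims.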
Four scenarios, run against the production dev cluster:
- S1: 200 ok · confidence=verified
- S2: 403 delegation_signature_required
- S3: 403 chain_hash_mismatch
- S4: 403 jwt_invalid
S2 catches a hostile or out-of-date SDK that doesn't sign. S3 catches token replay across chains - the canonical case where a compromised intermediary tries to graft a stolen proof onto a new path. S4 catches the simplest forgery attempt: any agent signing on behalf of another. The verifier code is ag-gateway/src/delegation.rs::verify_signed_delegation; the same chain-hash function lives byte-for-byte in sdk/python/clampd/auth.py and sdk/typescript/src/auth.ts, with a Rust regression test that asserts the literal SHA-256 so a future drift breaks CI.
Kill cascade: what happens when you pull the plug
Approval governs the steady state. Enforcement catches violations. Kill is what you do when one of your agents is already compromised and the rest of the tree needs to come down with it.
The dashboard kill endpoint inserts a kill_agent command. ag-control picks it up over WebSocket and calls ag-kill with cascade_descendants=true by default. From there, the cascade walks the agent_relationships table and revokes every descendant in the tree up to depth 5. The credential hash is nulled on each agent, so the next gateway call from any of them fails with invalid_jwt: Agent authentication failed.
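In sketch form, the walk looks like this. The real implementation is ag-kill/src/cascade.rs; this Python stand-in uses a plain child map instead of the agent_relationships table, and the reason-chain format is illustrative.

```python
# Illustrative cascade walk: kill the root, then every descendant up to
# depth 5, recording a reason chain that names the path back to the root.
MAX_DEPTH = 5

def cascade_kill(root: str, children: dict, depth: int = 0, chain=None) -> dict:
    chain = chain or [root]
    killed = {root: " -> ".join(chain)}  # agent -> reason chain
    if depth >= MAX_DEPTH:
        return killed  # depth cap also bounds any accidental cycle
    for child in children.get(root, []):
        killed.update(cascade_kill(child, children, depth + 1, chain + [child]))
    return killed
```

In the real system each entry in `killed` corresponds to nulling that agent's credential hash, which is why every descendant's next gateway call fails to authenticate.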
Live test against the dev cluster, killing a freshly-provisioned 3-agent tree:
What the test asserted, in order:
- Provisioned three fresh agents via the dashboard API, granted them net:*, db:*, comms:* scopes.
- Ran a depth-3 delegation chain to populate agent_relationships: kc-orch → kc-research → kc-writer.
- Hit the kill endpoint on the root with reason "e2e kill-cascade test".
- Eight seconds later, called the gateway as kc-orch via the SDK. Got back invalid_jwt: Agent authentication failed. This usually means the agent is suspended.
- Read the dashboard for all three agents. Root was killed with the test reason. Both descendants were killed with reason cascade: parent ea4e384d-d3e5-4b21-94e6-64ec540457.
Eight seconds, end-to-end. That's the wall-clock time from "click kill" to "every agent in the subtree refuses to authenticate." The 5-minute auto-discovery loop has nothing to do with it - kill is event-driven through ag-control's WebSocket command stream, not the polling sync.
The 10-minute onboarding
1. Install
# Python
pip install clampd
# TypeScript
npm install @clampd/sdk
2. Init at startup
import os
import clampd
clampd.init(
agent_id="orchestrator",
gateway_url=os.environ["CLAMPD_GATEWAY_URL"], # e.g. http://localhost:8080
api_key=os.environ["CLAMPD_API_KEY"],
secret=os.environ["ORCHESTRATOR_SECRET"],
agents={
"orchestrator": os.environ["ORCHESTRATOR_SECRET"],
"research-agent": os.environ["RESEARCHER_SECRET"],
"writer-agent": os.environ["WRITER_SECRET"],
},
)
Each agent has its own JWT secret, its own kill switch, its own EMA score. The gateway never sees your LLM provider keys. Org-scoped X-AG-Key authenticates the application; per-agent JWTs authenticate which agent is making any individual call.
3. Declare your tools at boot
from clampd import Category, Subcategory, Operation
clampd.register_tool(
"db.query",
category=Category.DB,
subcategory=Subcategory.QUERY,
operation=Operation.READ,
description="Read-only SQL against the analytics replica",
param_schema={"type": "object", "properties": {"sql": {"type": "string"}}},
)
This pre-classifies the tool against the taxonomy in ag-common/src/categories.toml. If you skip it, the first runtime call backfills the descriptor automatically, but you spend that first call in a "pending" state where stricter rules apply. Registering at boot is the trade you want.
4. Wrap an existing function with one line
This is the part that surprises people. The agent loop, the framework, the LLM client, all stay the same. You add one decorator.
Before:
def run_query(sql: str) -> str:
return db.execute(sql)
After:
@clampd.guard("db.query")
def run_query(sql: str) -> str:
return db.execute(sql)
That is the entire change. On every call the decorator:
- Enters a delegation context (tracks the caller chain, checks depth and cycles).
- Posts to the gateway's /v1/proxy with the tool name, params, and cached descriptor hash.
- Raises ClampdBlockedError if the gateway denies, with matched_rules and risk_score populated so your error handler can branch.
- Snapshots kwargs with copy.deepcopy before executing the underlying function. This closes a TOCTOU window between the policy check and the actual call.
- Stashes the returned scope_token in a contextvars.ContextVar so the tool itself can verify it.
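A stripped-down sketch of that shape, to make the control flow concrete. This is not the SDK's code: the gateway call is stubbed with a plain callable so the deny path is visible, and the scope-token and delegation-context steps are omitted.

```python
import copy
import functools

class ClampdBlockedError(Exception):
    """Mirrors the SDK error shape described above: carries rule hits and risk."""
    def __init__(self, matched_rules, risk_score):
        super().__init__(f"blocked: {matched_rules}")
        self.matched_rules = matched_rules
        self.risk_score = risk_score

def guard(tool_name, check_gateway=None):
    """Hypothetical guard decorator: check first, snapshot kwargs, then call."""
    check = check_gateway or (lambda tool, kwargs: {"allowed": True})
    def decorate(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            verdict = check(tool_name, kwargs)
            if not verdict["allowed"]:
                raise ClampdBlockedError(verdict.get("matched_rules", []),
                                         verdict.get("risk_score"))
            # Deep-copy kwargs so nothing can mutate them between the policy
            # check and the actual execution (the TOCTOU window above).
            safe_kwargs = copy.deepcopy(kwargs)
            return fn(*args, **safe_kwargs)
        return wrapper
    return decorate
```

The deepcopy is the easy-to-miss part: without it, a shared mutable dict could pass inspection and then be rewritten by another coroutine before the tool runs.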
Add check_response=True to also scan whatever the function returns. PII and secrets in tool responses are how exfiltration usually gets back into the LLM context.
5. Compose a 3-agent delegation chain
with clampd.agent("orchestrator"):
plan = run_planner(goal)
with clampd.agent("research-agent"):
sources = guarded_search(query=plan.query)
with clampd.agent("writer-agent"):
draft = guarded_write(sources=sources)
The chain orchestrator -> research-agent -> writer-agent propagates through contextvars. Every guarded call inside sees the full chain and forwards delegation_chain plus delegation_trace_id to the gateway. Try to exceed depth 5 and the SDK raises ClampdBlockedError before the request leaves your process. Try to cycle and same story.
In TypeScript the equivalent is clampd.agent("orchestrator", async () => { ... }). The withDelegation primitive still exists for backward compatibility but new code should use clampd.agent.
6. Scan inputs and outputs explicitly when you need to
If your agent does its own LLM call (you have no OpenAI or Anthropic SDK in the path, you talk HTTP directly), the SDK exposes the scanners as plain functions:
scan = client.scan_input(user_prompt)
if not scan.allowed:
raise RuntimeError(f"blocked: {scan.denial_reason} rules={scan.matched_rules}")
output_scan = client.scan_output(llm_response)
if not output_scan.allowed:
handle_pii(output_scan.pii_found, output_scan.secrets_found)
scan_input runs prompts against the prompt-scoped rules. scan_output runs PII and secret detection and returns the actual hits, not just a boolean.
Framework adapters
If you don't want to decorate every function, the SDK ships one-line wrappers for the common frameworks. Each one does the same job at a different integration point.
- LangChain. clampd.langchain(agent_id="...") returns a BaseCallbackHandler. Attach it via agent.invoke(input, config={"callbacks": [...]}). Source: sdk/python/clampd/langchain_callback.py.
- CrewAI. clampd.crewai(agent_id="...") returns a guard with step_callback and wrap_tool. Source: sdk/python/clampd/crewai_callback.py.
- Google ADK. clampd.adk(agent_id="...") returns a before_tool_callback, plus an after_tool_callback when check_response=True.
- OpenAI. clampd.openai(OpenAI(), agent_id="...") is a drop-in: client.chat.completions.create gets wrapped, tool calls get guarded, streaming works through guard_openai_stream.
- Anthropic. clampd.anthropic(Anthropic(), agent_id="...") does the same for messages.create.
- MCP servers. Run python -m clampd.mcp_server --downstream "..." --agent-id ... --gateway ... to expose any downstream MCP server through the pipeline. Drop the result into Claude Desktop's mcpServers config and the model never talks to the raw server again.
- AutoGen. Not shipped in the SDK yet. We mention it in the static scanner because clampd scan can find AutoGen usage in your repo, but there is no runtime adapter today. If this matters to you, file an issue.
Hashing tool descriptors protects between deploys, not between minutes inside one MCP session. If a server mutates a tool while a long-lived session is open and the agent never re-discovers tools, the original hash still validates. Reasonable real-world MCP clients re-discover, but if yours doesn't, you have a smaller window the hash can't cover. We are working on streaming re-discovery and would take a PR if you've solved this elegantly.
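For reference, a content-addressable descriptor hash over name + description + parameter schema is only a few lines. The sorted-key JSON canonicalization here is an assumption about the encoding, not necessarily what Clampd does; the property it buys is the one described in the rug-pull section.

```python
import hashlib
import json

def descriptor_hash(name: str, description: str, param_schema: dict) -> str:
    # ASSUMPTION: canonicalization via sorted-key, whitespace-free JSON.
    # Any deterministic encoding works; it just has to be the same one
    # at registration time and at call time.
    blob = json.dumps(
        {"name": name, "description": description, "schema": param_schema},
        sort_keys=True, separators=(",", ":"),
    )
    return hashlib.sha256(blob.encode()).hexdigest()

h = descriptor_hash(
    "web.search",
    "Search the web",
    {"type": "object", "properties": {"q": {"type": "string"}}},
)
# A vendor editing the description or schema yields a different digest,
# which is what surfaces as descriptor_hash_mismatch at call time.
```

The limitation described above follows directly: the hash only changes when the client re-reads the descriptor, so a mid-session mutation that is never re-discovered is invisible to it.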
What you don't need to do
The frequent objection to wrapping anything is "I can't rewrite my system." Fair. Here is what you don't have to give up.
- You don't rewrite the agent loop. Every adapter is opt-in at the edges. The loop stays yours.
- You don't switch LLM providers. Clampd is provider-agnostic. OpenAI, Anthropic, Bedrock, Vertex, local Ollama, anything that fits behind a messages.create or a custom call. The gateway never receives your LLM API keys.
- You don't deploy SaaS. Self-hosted Docker Compose, source-available under BSL-1.1. Ten services (ag-gateway, ag-intent, ag-policy, ag-token, ag-shadow, ag-risk, ag-kill, ag-control, ag-registry, plus ag-redteam for the live demos). Postgres, Redis, NATS. That's it.
- You don't hand us your secrets. Each agent's JWT secret stays client-side. The gateway sees the signed JWT plus your org-scoped X-AG-Key. Nothing else.
The trust model in three lines
Per-agent API key, per-agent HS256-signed JWT, per-agent kill switch. The gateway never sees your LLM provider credentials. Kill one agent and the 8-layer cascade in ag-kill/src/cascade.rs walks the tree up to depth 5 and revokes every descendant, with a reason chain that names the root.
Add @clampd.guard to one tool today. The rest can wait. One protected tool call gives you the audit trail, the kill switch, and the per-call signed authorization. Everything else is incremental from there.
Two patterns I see most often
The first is teams that wrap their planner with clampd.agent("orchestrator") and decorate two or three high-risk tools (anything that writes, anything that talks outbound). Within a week they have enough audit data to argue with security about what's actually happening at runtime. That conversation used to go badly. Now it goes "here is the trace."
The second is teams that watch the auto-discovered graph for a week, lock it, and from that point on every new edge is queued for review instead of silently working. The risk feed and the workflow lock cycle in the section above is what that day-two operation actually feels like. Forward direction passes, reverse is denied, revocation propagates in roughly 20 seconds. No SQL, no Redis surgery, no manual JWT minting.
Closing
If you have one production agent and zero protection, you are not in a worse spot than most of the industry. You are also one decorator away from a real audit trail. Start with pip install clampd, point at a local gateway, decorate one tool. Watch the audit log fill up. Decide what else you want to guard.
For the adversarial side of this, the live demos at redteam.clampd.dev run the same attack categories against the same gateway code your SDK is about to call. Prompt injection, rug-pull schema swaps, delegation cycle attempts, kill cascade. You can rerun any of them against your own deployment.
Try Clampd in 60 seconds
One line of Python or TypeScript. Works with OpenAI, Anthropic, LangChain, CrewAI, Google ADK, and any MCP server. Self-hosted, source-available, no telemetry by default.
pip install clampd
npm install @clampd/sdk