The kill switch: how Clampd stops a rogue agent across 8 layers in milliseconds

When an AI agent goes wrong in production, every minute between detection and revocation is a minute of additional risk. Detection without response is just logging. Here's the 8-layer cascade we run when an agent gets killed: each layer is independent and idempotent, and the per-layer latency budgets sum to under one second.

An incident-response question every security architect asks (and most AI products dodge): "When you decide an agent has been compromised, how exactly do you stop it from doing the next thing?" There's a real answer or there isn't, and the answer separates products that can be deployed in regulated environments from products that get bounced in vendor security review.

This post is the answer for Clampd. The cascade structure below is documented in services/crates/ag-kill/src/cascade.rs with these specific layers and these specific latency budgets, and you can read the source.

How a kill gets triggered

Three trigger paths. Each lands in the same cascade.

Auto-suspend (automated). ag-risk continuously scores every agent's behaviour with an EMA model. When the per-agent risk crosses AUTO_SUSPEND_THRESHOLD (default 0.9), ag-risk calls ag-kill directly via gRPC. No human in the loop. Used for clear-cut compromise (cross-agent privilege escalation, severe rule violations, sustained anomaly).
Manual (dashboard). A human operator clicks "kill" or "suspend" on the dashboard's agent panel. The dashboard API tells ag-control over WebSocket; ag-control invokes ag-kill via gRPC. Requires owner or admin role. The reason is recorded in the audit trail.
Programmatic (HTTP fallback). The dashboard exposes POST /v1/runtime/agent-state with body { agent_id, new_state: "killed", reason }. Used by your incident-response automation, your SIEM playbooks, and as the fallback when the WebSocket path between ag-control and the dashboard is down. Internal services prefer the gRPC KillService.KillAgent RPC directly when they have HMAC auth set up.

Whichever trigger fires, the request lands at execute_cascade() in ag-kill's cascade module. The agent's fate from that point on is determined by what happens in the next eight steps.
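To make the auto-suspend path concrete, here is a minimal sketch of an EMA risk tracker with a threshold check. Only the 0.9 threshold comes from this post; the `EmaRisk` type, the smoothing factor, and the choice of `>=` for "crosses" are assumptions for illustration, not ag-risk's actual model.

```rust
/// Threshold from the post (AUTO_SUSPEND_THRESHOLD default).
const AUTO_SUSPEND_THRESHOLD: f64 = 0.9;

/// Hypothetical per-agent risk tracker using an exponential moving average.
struct EmaRisk {
    alpha: f64, // weight given to the newest observation (assumed value)
    score: f64, // current smoothed risk in [0, 1]
}

impl EmaRisk {
    fn new(alpha: f64) -> Self {
        Self { alpha, score: 0.0 }
    }

    /// Fold one risk observation into the average and return the new score.
    fn observe(&mut self, risk: f64) -> f64 {
        let risk = risk.clamp(0.0, 1.0);
        self.score = self.alpha * risk + (1.0 - self.alpha) * self.score;
        self.score
    }

    /// True once the smoothed score reaches the threshold
    /// (treating "crosses" as >= is an assumption).
    fn should_auto_suspend(&self) -> bool {
        self.score >= AUTO_SUSPEND_THRESHOLD
    }
}
```

With an assumed alpha of 0.5, a single maximal observation only moves the score to 0.5, so one noisy signal does not trip the kill. It takes a sustained run of high-risk observations to reach the threshold.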

The 8 layers

This table is copied directly from cascade.rs:1-17. The latency column is the per-layer budget; the cascade fans out and runs them in parallel where dependencies allow.

Layer	Action	Target	Budget
1	Deny list SET	Redis `ag:deny:{id}` TTL=24h	<1ms
2	Gateway broadcast	NATS PUBLISH `agentguard.kill`	<1ms
3	Token cache flush	gRPC → ag-token RevokeAgent	<5ms
4	Session termination	Redis SCAN+DEL `ag:session:*`	<5ms
5	IdP session revoke	via Layer 3 → ag-token → IdP	<500ms
6	Agent state change	gRPC → ag-registry	<5ms
7	Event broadcast	NATS PUBLISH `agentguard.events`	<1ms
8	Audit log	PostgreSQL INSERT `kill_audit`	<5ms

What each layer actually does for the security model.

Layer 1 — Deny list SET (Redis, <1ms)

The first thing that happens, every time. Redis SET on ag:deny:{agent_id} with a 24-hour TTL. Every Clampd gateway instance reads this key on every incoming proxy call. The moment this SET completes, no further tool calls from the agent will pass auth, regardless of whether the rest of the cascade succeeded. This layer alone is the difference between "stopped now" and "stopped eventually." It runs first for that reason.
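A rough sketch of the gateway-side check, using an in-memory map as a stand-in for Redis. The `DenyList` type and `authorize` function are hypothetical; only the `ag:deny:{id}` key shape and the 24-hour TTL come from the table above.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// In-memory stand-in for the Redis deny list (hypothetical type).
struct DenyList {
    expires_at: HashMap<String, Instant>,
}

impl DenyList {
    fn new() -> Self {
        Self { expires_at: HashMap::new() }
    }

    /// Similar in spirit to a SET with a TTL on `ag:deny:{agent_id}`.
    fn deny(&mut self, agent_id: &str, ttl: Duration, now: Instant) {
        self.expires_at.insert(format!("ag:deny:{agent_id}"), now + ttl);
    }

    /// True while an unexpired deny entry exists for the agent.
    fn is_denied(&self, agent_id: &str, now: Instant) -> bool {
        self.expires_at
            .get(&format!("ag:deny:{agent_id}"))
            .map_or(false, |expiry| now < *expiry)
    }
}

/// The check a gateway would make before proxying any tool call.
fn authorize(deny: &DenyList, agent_id: &str, now: Instant) -> Result<(), &'static str> {
    if deny.is_denied(agent_id, now) {
        return Err("agent is on the deny list");
    }
    Ok(())
}
```

The point of the sketch is the ordering: the deny check does not depend on token state, session state, or the registry, so it holds even when every later layer fails.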

Layer 2 — Gateway broadcast (NATS, <1ms)

NATS PUBLISH on agentguard.kill tells every gateway instance in the fleet "this agent is dead now." Gateways listening to this subject invalidate their local caches. Without it, a gateway with a hot in-memory token cache could allow one or two more calls before the cache TTL expires. With it, those caches are flushed within tens of milliseconds. Published to JetStream so the message survives consumer restarts.
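Here is a simplified sketch of what a gateway does on receiving a kill message, with a standard-library channel standing in for the NATS subscription. The `Gateway` type and its cache layout are assumptions, not the real gateway code.

```rust
use std::collections::HashMap;
use std::sync::mpsc::Receiver;

/// Hypothetical gateway with a hot in-memory token cache.
struct Gateway {
    /// agent_id -> cached tokens for that agent (assumed layout)
    token_cache: HashMap<String, Vec<String>>,
    /// Stand-in for a subscription to the `agentguard.kill` subject
    kills: Receiver<String>,
}

impl Gateway {
    /// Drain pending kill messages and evict the named agents from the cache.
    /// Returns how many cache entries were actually removed.
    fn drain_kill_messages(&mut self) -> usize {
        let mut flushed = 0;
        while let Ok(agent_id) = self.kills.try_recv() {
            if self.token_cache.remove(&agent_id).is_some() {
                flushed += 1;
            }
        }
        flushed
    }
}
```

A duplicate kill message for the same agent is harmless here: the second removal finds nothing and changes nothing.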

Layer 3 — Token cache flush (gRPC → ag-token, <5ms)

gRPC to ag-token's RevokeAgent. Every scope token issued to this agent is now considered revoked. Tokens that the agent has already obtained but not yet spent become useless. JWKS verification still passes (the signature is fine), but the token's jti is on a revocation list that ag-token consults.
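The "signature is fine but the token is dead" behaviour can be sketched as follows. `TokenService`, `ScopeToken`, and the method names are hypothetical stand-ins; the real RevokeAgent RPC and its storage are not shown in this post.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical scope token; `signature_valid` stands in for JWKS verification.
struct ScopeToken {
    jti: String,
    signature_valid: bool,
}

/// Hypothetical stand-in for ag-token's issuance and revocation state.
struct TokenService {
    issued: HashMap<String, Vec<String>>, // agent_id -> jtis issued to it
    revoked: HashSet<String>,             // revocation list keyed by jti
}

impl TokenService {
    fn new() -> Self {
        Self { issued: HashMap::new(), revoked: HashSet::new() }
    }

    fn issue(&mut self, agent_id: &str, jti: &str) -> ScopeToken {
        self.issued.entry(agent_id.to_string()).or_default().push(jti.to_string());
        ScopeToken { jti: jti.to_string(), signature_valid: true }
    }

    /// Put every jti issued to the agent on the revocation list.
    /// Returns the number of newly revoked tokens.
    fn revoke_agent(&mut self, agent_id: &str) -> usize {
        let jtis = self.issued.get(agent_id).cloned().unwrap_or_default();
        let mut newly_revoked = 0;
        for jti in jtis {
            if self.revoked.insert(jti) {
                newly_revoked += 1;
            }
        }
        newly_revoked
    }

    /// A token is accepted only if its signature verifies AND its jti is not revoked.
    fn accept(&self, token: &ScopeToken) -> bool {
        token.signature_valid && !self.revoked.contains(&token.jti)
    }
}
```

Calling `revoke_agent` a second time returns 0 and leaves the revocation list unchanged, which is the property the cascade relies on when two triggers race.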

Layer 4 — Session termination (Redis SCAN+DEL, <5ms)

Every ag:session:* key for the agent gets deleted. Session-derived state, in-flight session-pattern flags, the rolling EMA cache, all wiped. The agent restarts from clean state if it ever comes back, but more importantly nothing about its prior session state can be referenced by an attacker who somehow keeps it alive.

Layer 5 — IdP session revoke (<500ms)

If the agent was issued tokens that chain through to an external IdP (e.g. Keycloak, Okta), this layer triggers IdP-side revocation. This is the slowest layer in the cascade by an order of magnitude because it's a network round-trip to an external system. It's last among the synchronous layers for that reason. If the IdP is unreachable, the cascade still completes — the deny list (Layer 1) means no further calls succeed regardless.

Layer 6 — Agent state change (gRPC → ag-registry, <5ms)

Marks the agent as SUSPENDED or KILLED in PostgreSQL. This layer retries 3 times. If all 3 fail, the deny TTL is extended to 24 hours so the agent stays denied even without registry consensus. This is the explicit "if the database is down, security still holds" guarantee.
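The retry-then-fallback shape can be sketched like this. The function and type names are hypothetical, the closures stand in for the ag-registry call and the Redis TTL update, and reading "retries 3 times" as "up to three attempts" is an assumption.

```rust
use std::time::Duration;

const MAX_ATTEMPTS: u32 = 3; // "retries 3 times", read here as three attempts
const FALLBACK_DENY_TTL: Duration = Duration::from_secs(24 * 60 * 60);

#[derive(Debug, PartialEq)]
enum RegistryOutcome {
    Updated { attempts: u32 },
    FellBackToDenyList { ttl: Duration },
}

/// Try the registry update up to MAX_ATTEMPTS times; if every attempt fails,
/// extend the deny TTL so the agent stays blocked without the registry.
fn set_state_with_fallback<U, E>(mut update_registry: U, mut extend_deny_ttl: E) -> RegistryOutcome
where
    U: FnMut() -> Result<(), String>,
    E: FnMut(Duration),
{
    for attempt in 1..=MAX_ATTEMPTS {
        if update_registry().is_ok() {
            return RegistryOutcome::Updated { attempts: attempt };
        }
    }
    extend_deny_ttl(FALLBACK_DENY_TTL);
    RegistryOutcome::FellBackToDenyList { ttl: FALLBACK_DENY_TTL }
}
```

Note that the fallback is reported as its own outcome rather than swallowed, so an operator can tell "registry updated" apart from "registry down, deny list carrying the load".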

Layer 7 — Event broadcast (NATS, <1ms)

Different subject from Layer 2. agentguard.events is the kitchen-sink event stream for everything the dashboard, SIEM integrations, and webhook delivery service consume. Tells your Slack channel, your PagerDuty, your custom incident automation that this just happened.

Layer 8 — Audit log (PostgreSQL INSERT, <5ms)

Permanent record. Agent ID, reason, who initiated, timestamp, results from each layer. This is the row you'll be looking at in three weeks when someone asks "wait, why did we kill that agent?" and you have to give a defensible answer.

Independence and idempotency

Two design properties of the cascade matter for production safety.

Each layer is independent

Layer 5 is slow and depends on an external IdP. Layer 6 retries 3x. Neither blocks the others. If Layer 5 times out, Layer 6 still updates the registry. If Layer 6 fails 3 times, Layer 1 is still in place denying calls. The cascade returns a per-layer result list so the dashboard can show "killed: layers 1, 2, 3, 4, 6, 7, 8 succeeded; layer 5 timeout." Operators get to see exactly what happened.
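A minimal sketch of "run every layer, record every result". The types here are hypothetical, and this version runs the layers sequentially for simplicity, whereas the real cascade fans out in parallel where dependencies allow. What it shares with the real thing is that a failure in one layer never short-circuits the rest.

```rust
#[derive(Debug, Clone, PartialEq)]
enum LayerStatus {
    Ok,
    Failed(String),
}

#[derive(Debug, Clone, PartialEq)]
struct LayerResult {
    number: u8,
    name: &'static str,
    status: LayerStatus,
}

/// One cascade step; `run` stands in for the real Redis/NATS/gRPC call.
struct Layer {
    number: u8,
    name: &'static str,
    run: fn() -> Result<(), String>,
}

/// Run every layer regardless of earlier failures and collect per-layer results.
fn run_layers(layers: &[Layer]) -> Vec<LayerResult> {
    layers
        .iter()
        .map(|layer| LayerResult {
            number: layer.number,
            name: layer.name,
            status: match (layer.run)() {
                Ok(()) => LayerStatus::Ok,
                Err(why) => LayerStatus::Failed(why),
            },
        })
        .collect()
}

/// Render a one-line summary an operator could read at a glance.
fn summarize(results: &[LayerResult]) -> String {
    let ok: Vec<String> = results
        .iter()
        .filter(|r| r.status == LayerStatus::Ok)
        .map(|r| r.number.to_string())
        .collect();
    let failed: Vec<String> = results
        .iter()
        .filter_map(|r| match &r.status {
            LayerStatus::Failed(why) => Some(format!("{} {} ({})", r.number, r.name, why)),
            LayerStatus::Ok => None,
        })
        .collect();
    format!("succeeded: [{}]; failed: [{}]", ok.join(", "), failed.join(", "))
}
```

The contrast is with a chain of `?` operators, where the first error returns early and every later layer silently never runs.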

Fully idempotent

Killing the same agent twice produces the same final state, not an error. Important because the trigger paths are not coordinated: the dashboard kill, the auto-suspend, and an API call could all fire within milliseconds. We don't want race conditions to leave the agent half-killed. Every layer's operation is idempotent: SET on a deny list (already there is fine), revoke a token list (already revoked is fine), state change to SUSPENDED (already SUSPENDED is fine).
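Idempotency is easy to state as a test: apply the operation twice and compare the resulting state. The sketch below uses a toy `World` type (hypothetical, holding just a deny set, a revoked set, and an agent state) to show the property, not the real storage.

```rust
use std::collections::HashSet;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AgentState {
    Active,
    Suspended,
}

/// Toy model of the state a suspend touches (hypothetical).
#[derive(Debug, Clone, PartialEq, Eq)]
struct World {
    deny_list: HashSet<String>,
    revoked_agents: HashSet<String>,
    state: AgentState,
}

impl World {
    fn new() -> Self {
        Self {
            deny_list: HashSet::new(),
            revoked_agents: HashSet::new(),
            state: AgentState::Active,
        }
    }
}

/// Every step is a "set to X", never a "toggle" or "increment",
/// so repeating the call cannot change the outcome.
fn suspend(world: &mut World, agent_id: &str) {
    world.deny_list.insert(agent_id.to_string());
    world.revoked_agents.insert(agent_id.to_string());
    world.state = AgentState::Suspended;
}
```

If any step were written as "append to a list" or "flip a flag", two racing triggers could leave different final states depending on arrival order.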

Cascading to children

If the killed agent has delegated to other agents (using the A2A delegation chain Clampd tracks), the kill optionally propagates down the tree.

pub struct KillContext {
    pub agent_id: String,
    pub reason: String,
    pub initiated_by: String,
    pub revoke_permanently: bool,
    pub kill_sessions: bool,
    // If true, walk the delegation tree and kill all descendants.
    pub cascade_descendants: bool,
    // Maximum depth for the tree-walk cascade (default 5).
    pub max_tree_depth: u32,
}

With cascade_descendants=true, the cascade walks the delegation tree (up to max_tree_depth deep, max 10 concurrent descendant cascades) and runs the same 8-layer kill on every agent in the chain. The result returned to the caller separates root-agent layer results from descendant-agent layer results, so an operator can see "we killed the parent and 4 of its 5 descendants; the 5th descendant cascade had a layer 6 failure."
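The tree walk itself can be sketched as a depth-bounded breadth-first search. The function name and the map-based delegation graph are assumptions for illustration. The sketch treats direct delegates as depth 1, guards against cycles, and does not model the 10-way concurrency limit.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Collect the descendants of `root` in breadth-first order, going at most
/// `max_tree_depth` levels down. `delegations` maps an agent to the agents
/// it delegated to (hypothetical representation of the A2A chain).
fn descendants_to_kill(
    delegations: &HashMap<String, Vec<String>>,
    root: &str,
    max_tree_depth: u32,
) -> Vec<String> {
    let mut seen: HashSet<String> = HashSet::from([root.to_string()]);
    let mut queue: VecDeque<(String, u32)> = VecDeque::from([(root.to_string(), 0)]);
    let mut out = Vec::new();

    while let Some((agent, depth)) = queue.pop_front() {
        if depth == max_tree_depth {
            continue; // do not expand below the depth limit
        }
        for child in delegations.get(&agent).into_iter().flatten() {
            // `insert` returns false for agents already visited, which
            // keeps a delegation cycle from looping forever.
            if seen.insert(child.clone()) {
                out.push(child.clone());
                queue.push_back((child.clone(), depth + 1));
            }
        }
    }
    out
}
```

Each agent in the returned list would then get its own run of the cascade, with its results reported separately from the root's.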

This matters because in a delegation-chain compromise, the parent may be the proximate detection but the descendants may already be acting on the compromised parent's behalf. Killing only the parent leaves a live attack surface.

What this isn't

Honest limits

The 8-layer cascade revokes future authorisation. It does not undo past actions. If the agent already moved $50,000 to an attacker before the kill fired, that money is gone; the kill doesn't reverse the transaction. Detection latency (how fast we notice the agent is bad) and cascade latency (how fast we stop further damage once we notice) are different problems and the cascade only addresses the second. Fast detection is what the rest of the platform is for.

What you'd actually see in an incident

Dashboard view, abbreviated:

kill agent-7b3a-... · reason "auto_suspend: ema_risk=0.93"

cascade results (run_id 9c2f1e):
  L1 deny_list_set        ok    0.6ms
  L2 nats_kill_broadcast  ok    0.4ms
  L3 token_cache_flush    ok    3.2ms
  L4 session_terminate    ok    4.1ms
  L5 idp_revoke           ok   183ms     // network round-trip
  L6 registry_state_set   ok    2.8ms
  L7 nats_events          ok    0.3ms
  L8 audit_insert         ok    3.7ms

descendants_killed       4
descendants_failed       0
descendant_layer_results (32 layer results, 4 agents x 8 layers)

total cascade duration  198.5ms

Almost all the time is in Layer 5 (the IdP). The other seven layers complete inside 20ms. If your IdP integration is local (cached) instead of remote, you can hit ~20ms total wall-clock for an 8-layer agent revocation.

Why this matters in vendor evaluation

Two questions to ask any AI security vendor talking about kill switches:

"What happens when the kill fails partway through?" If the answer is "we retry the whole thing", they don't have layered isolation. If the answer is "the deny list is still in place so the agent is still stopped, and we surface which layers failed", they thought about this.
"Does the kill cascade descendants in a delegation chain?" If they don't have the concept of delegation chains, they can't propagate. If they do but propagation is "best effort with no result reporting", an operator can't know whether the kill actually worked. The right answer reports per-descendant per-layer status so operators can act on partial failures.

What you can do without Clampd

Make the deny list the first thing you do, always. Whatever architecture you have, set a deny flag in your hot cache before you start tearing down sessions, revoking tokens, calling external systems. The deny flag is what stops further damage. Everything else is housekeeping.
Make layers independent. If your "kill agent" function is a 10-step sequence that aborts on first error, your security model is "agent stays partially alive when something breaks." Independence per step matters more than transactional consistency.
Log per-layer results, not just final status. "kill failed" is useless for incident review. "kill succeeded but layer 6 (registry state change) failed after 3 retries, deny TTL extended to 24h as fallback" is forensic-quality.

Try Clampd in 60 seconds

One line of Python or TypeScript. The 8-layer kill cascade is the same on hosted and self-hosted; trigger it via the dashboard, auto-suspend at 0.9 EMA, or the dashboard API's /v1/runtime/agent-state endpoint. Clampd is source-available, and the cascade source is in ag-kill/src/cascade.rs for review.

pip install clampd
npm install @clampd/sdk
