If you only inspect each tool call in isolation, you can be perfectly accurate per-call and still miss the actual attack. Real agent compromise rarely shows up as a single malicious payload. It shows up as a sequence: a sequence of small reads, then a write to an outbound channel; a sequence of probing calls across different scopes; a sequence of calls whose risk score keeps inching up but never quite crosses the block line.
This is the gap session detection fills. Below are the sixteen patterns Clampd runs on every classify request, what each one fires on, and what we use them for.
The two buckets: flag-based and context-based
The patterns split into two layers.
Flag-based patterns read from a list of session flags accumulated across previous calls in the same session (e.g. bulk_read, schema_recon, denied). They run cheaply, with no JSON parse required. Eight of the sixteen patterns are flag-based.
Context-based patterns require a parsed session context: histograms of categories, rolling totals, recent risk trajectory. These are heavier (JSON parse, math) but they catch the patterns that flag-counting can't, like sawtooth evasion or volume anomalies.
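To make the split concrete, here's a minimal sketch of the two layers. Every type, flag, and threshold below is illustrative, not the actual Clampd code: `SessionFlags`, `outbound_write`, and the minimum-trend length are assumptions for the example.

```rust
use std::collections::HashSet;

/// Flags accumulated across prior calls in the session (cheap set lookups).
struct SessionFlags(HashSet<&'static str>);

/// Parsed context: rolling totals and the recent risk trajectory (heavier).
struct SessionContext {
    risk_trend: Vec<f64>,
    outbound_bytes_total: u64, // illustrative field, unused in this sketch
}

/// A flag-based pattern: pure set-membership tests, no JSON parse needed.
fn check_read_then_exfil(flags: &SessionFlags) -> bool {
    flags.0.contains("bulk_read") && flags.0.contains("outbound_write")
}

/// A context-based pattern: needs numbers, not just booleans.
/// Fires when at least three recent scores are monotonically non-decreasing.
fn check_escalating_risk(ctx: &SessionContext) -> bool {
    ctx.risk_trend.len() >= 3
        && ctx.risk_trend.windows(2).all(|w| w[1] >= w[0])
}
```

The design point: the flag layer answers "did X ever happen in this session?" while the context layer answers "what shape does the session have?" — two different costs, two different signal classes.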
The sixteen patterns
Names below match the function names in services/crates/ag-intent/src/session/patterns.rs. If you read the source, you'll find the actual thresholds (configurable per-pattern).
| Pattern | Layer | What it fires on |
|---|---|---|
| read_then_exfil | flag | Bulk read followed by an outbound call. The classic exfiltration trail. |
| recon_then_attack | flag | Schema or capability probing followed by a destructive action against the same target. |
| scope_probing | flag | Repeated denied calls across different scopes. The agent is "trying doors." |
| rapid_fire | flag | Burst of calls inside a short window. Often a precursor to brute-force or scraping. |
| tool_diversity_spike | context | Sudden expansion in the variety of tools the agent calls. Usually means new capability acquisition. |
| volume_anomaly | context | Outbound data volume above the agent's rolling baseline. |
| permission_escalation | flag | Calls climbing the privilege ladder: read, then write, then admin. |
| cross_tool_bridging | context | Connecting categories that don't normally talk: data from db:* ending up in net:* within the same session. |
| cross_agent_escalation | flag | An agent acquires capabilities through a delegated agent that exceed its own. |
| unauthorized_delegation | flag | Delegation to an agent that wasn't approved as a downstream peer. |
| first_time_tool | flag | Calling a tool the agent has never called before. Light signal on its own; valuable as a multiplier. |
| escalating_risk | context | Per-call risk score trending up over the last N calls. |
| sensitive_data_flow | context | Data tagged sensitive (PII, secrets) flowing toward an outbound category. |
| sensitive_chain | context | Chain of sensitive accesses across multiple tools in the same session. |
| slow_drip | context | Small amounts of data leaving over time. Below per-call volume threshold but adding up. |
| interleaved_evasion | context | Sawtooth risk score: deliberately oscillating high and low to never trip the per-call block threshold. |
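As one example of how cheap the flag layer can be, `scope_probing` reduces to counting distinct denied scopes. A sketch with a hypothetical threshold (the real thresholds are configurable per-pattern, per the source):

```rust
use std::collections::HashSet;

/// Illustrative sketch of scope_probing: repeated denials across
/// distinct scopes suggest an agent "trying doors".
fn check_scope_probing(denied_scopes: &[&str]) -> bool {
    let distinct: HashSet<_> = denied_scopes.iter().collect();
    // Hypothetical threshold: denials in 3+ different scopes.
    distinct.len() >= 3
}
```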
Why sawtooth evasion deserves its own pattern
An attacker who knows your block threshold sits at, say, 0.85 can deliberately structure their requests to bounce between 0.7 and 0.4. Each individual call passes. The agent never gets blocked. The attack succeeds.
Single-call inspection has no answer for this. You need to look at the shape of the recent risk trajectory. interleaved_evasion looks for direction changes in the last 6 risk-score data points; if the trajectory shows N+ direction changes in that small window, that's not a normal workload; it's evasion.
```rust
// Detects when risk_trend shows a sawtooth pattern (up-down-up-down).
pub fn check_interleaved_evasion(ctx: &SessionContext) -> Option<PatternResult> {
    // N+ direction changes in 6 data points = sawtooth pattern
    ...
}
```
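A self-contained sketch of the underlying check: count sign flips in the first differences of the trajectory. The 6-point window matches the text; the flip threshold here is a hypothetical stand-in, since the real thresholds live in the source and are configurable:

```rust
/// Count direction changes (sign flips of the first difference)
/// in a risk trajectory. Equal consecutive values don't count as a flip.
fn direction_changes(trend: &[f64]) -> usize {
    trend
        .windows(3)
        .filter(|w| (w[1] - w[0]) * (w[2] - w[1]) < 0.0)
        .count()
}

/// Hypothetical sawtooth check over the last 6 data points.
fn looks_like_sawtooth(trend: &[f64]) -> bool {
    let recent = &trend[trend.len().saturating_sub(6)..];
    direction_changes(recent) >= 3 // illustrative threshold, not Clampd's
}
```

Against the 0.7 / 0.4 oscillation from the example above, every interior point is a flip, so the check fires immediately; a steadily climbing workload produces zero flips and stays quiet.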
This pattern is one of the reasons single-stage WAFs miss a whole class of agent attacks: each individual request looks benign, and WAFs are stateless per-request by design, so they never see the trajectory.
How patterns feed the risk score
Each pattern that fires returns a PatternResult with a flag name and a risk_bonus. The risk bonus is added to the assessed risk for the current call. So a perfectly safe-looking call ("read one row") can become high-risk if multiple session patterns fire on it ("read one row" + slow_drip active + sensitive_chain active).
The flag is also persisted into the session for future calls. cross_tool_bridging firing on call N becomes a session flag that other patterns can read on call N+1.
This is intentional: it means an attack that took ten calls to set up still leaves a trail that the eleventh call sees. The session is the unit of detection, not the call.
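The accumulation step can be sketched in a few lines. Names and bonus values here are illustrative; the real per-pattern `risk_bonus` values are tunable, per the note at the end of the worked example:

```rust
/// Illustrative result shape: a flag name plus a risk bonus.
struct PatternResult {
    flag: &'static str,
    risk_bonus: f64,
}

/// Stack each firing pattern's bonus onto the per-call score and
/// persist its flag so later calls in the session can read it.
fn apply_patterns(
    base_risk: f64,
    fired: Vec<PatternResult>,
    session_flags: &mut Vec<&'static str>,
) -> f64 {
    let mut risk = base_risk;
    for p in fired {
        risk += p.risk_bonus;       // bonus stacks onto the per-call score
        session_flags.push(p.flag); // flag persists for future calls
    }
    risk.min(1.0) // clamp to the score range
}
```

This is why a "read one row" call with a 0.2 base score can end up high-risk: two active patterns contributing 0.3 and 0.4 push it to 0.9 before the block threshold is ever consulted.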
Worked example: the slow-drip exfiltration
An attacker compromises an agent with read access to a customer database and write access to a webhook URL. Their goal is to exfiltrate the customer table. They know the agent's risk-block threshold and the per-call volume cap.
Single-call defence: the agent reads 50 rows at a time. Volume is below threshold. Each call passes.
What our session layer sees:
- Call 1: `db:query:read`. Volume: 50 rows. Per-call risk is low. `first_time_tool` fires once, contributing a small bonus. Allowed.
- Calls 2–8: more 50-row reads. `volume_anomaly` doesn't fire because the rolling average is now 50 rows, but `slow_drip` starts watching the cumulative pattern.
- Call 9: `net:webhook:send` with a 200 KB body. `sensitive_data_flow` fires: data tagged from `db:*` is leaving via `net:*` in the same session. `cross_tool_bridging` fires too. The bonuses combine on top of the per-call rules score.
- Call 10: another 50-row read. `slow_drip` now correlates the rolling outbound total against the rolling read total and contributes its own bonus.
- Call 11: another `net:webhook:send`. `read_then_exfil` fires (flag-based, looking at recent history). `sensitive_chain` fires (a multi-tool chain of sensitive accesses). The accumulated session-derived bonus crosses the auto-suspend threshold, the kill cascade fires, and the agent's tokens are revoked fleet-wide.
The slow-drip attack succeeded against per-call inspection. It didn't survive the session layer. (Per-pattern risk_bonus values vary by pattern and are tunable per org; the source of truth is services/crates/ag-intent/src/session/patterns.rs.)
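The core of a slow-drip check is just two comparisons: every call stays under the per-call cap, but the session total crosses a budget. A sketch with hypothetical thresholds (the actual values are per-org, per the note above):

```rust
/// Illustrative slow-drip check over per-call outbound volumes.
/// Fires only when each call individually looks fine but the sum doesn't.
fn slow_drip_fires(outbound_per_call: &[u64], per_call_cap: u64, session_budget: u64) -> bool {
    let under_cap = outbound_per_call.iter().all(|&v| v <= per_call_cap);
    let total: u64 = outbound_per_call.iter().sum();
    under_cap && total > session_budget
}
```

Note the `under_cap` condition: a single oversized call is already caught by per-call volume rules, so the pattern deliberately restricts itself to the case those rules miss.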
Session patterns are not full behaviour modelling. They are an explicit, hand-built ruleset for known multi-step attack shapes. They catch what we know how to describe. They will miss attacks we haven't seen yet. We separately run a per-agent EMA risk score (the behavioural baseline layer) that catches "this agent is doing things it never used to do" without us having to enumerate the shape. Both layers feed the same final score.
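For reference, the baseline layer's EMA is the standard recursive update; the smoothing factor in this sketch is a hypothetical tuning choice, not a documented Clampd value:

```rust
/// One step of an exponential moving average:
/// new = alpha * sample + (1 - alpha) * previous.
fn ema_update(prev: f64, sample: f64, alpha: f64) -> f64 {
    alpha * sample + (1.0 - alpha) * prev
}
```

A sustained jump in per-call risk drags the EMA upward over a handful of calls, which is what lets "this agent is doing things it never used to do" surface without an enumerated pattern.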
Cross-agent: the special case
Two of the sixteen patterns specifically address agent-to-agent (A2A) workflows: cross_agent_escalation and unauthorized_delegation. These are critical because a delegation chain shifts the trust boundary: an agent that can't itself perform action X may legitimately delegate to an agent that can.
The patterns watch for two specific abuses:
- Privilege escalation through delegation: agent A delegates to agent B specifically because B has scopes A doesn't. A then uses B's results to take an action it couldn't have taken itself.
- Delegation outside the approved peer graph: every org configures which agents may delegate to which others. A delegation to a peer not in that allowlist is flagged regardless of what the called agent actually does.
Both pair with the tool descriptor hash check to make sure each agent in the chain is also calling the tool it actually got approved for.
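The allowlist check behind `unauthorized_delegation` can be sketched as a peer-graph lookup. The data structure here is illustrative, not the actual org config format:

```rust
use std::collections::{HashMap, HashSet};

/// A delegation is allowed only if the callee appears in the caller's
/// configured peer allowlist; an unknown caller allows nothing.
fn delegation_allowed(
    peer_graph: &HashMap<&str, HashSet<&str>>,
    caller: &str,
    callee: &str,
) -> bool {
    peer_graph
        .get(caller)
        .map_or(false, |peers| peers.contains(callee))
}
```

The default-deny shape matters: a caller with no entry in the graph can't delegate at all, which is what makes a delegation outside the approved graph flaggable "regardless of what the called agent actually does."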
What you can do without Clampd
Session detection isn't magic; it's just data we don't normally collect. Three suggestions if you're rolling something yourself:
- Make `session_id` mandatory in your audit schema, today. If your audit log only has tool name and risk score per call, you'll never reconstruct multi-step attacks after the fact. Add `session_id` + `caller_agent_id` + `delegation_chain` to every event. The cost is a few bytes per row.
- Pick the three highest-leverage patterns first. If you only build three: `volume_anomaly` (rolling outbound volume vs baseline), `sensitive_data_flow` (data from sensitive sources reaching outbound categories), and `scope_probing` (denials across different scopes). Those three catch most of the basic attacks.
- Build the sawtooth check last. It's the most useful, but only if you already have per-call risk scores you trust. Without those, sawtooth is just noise.
Try Clampd in 60 seconds
One line of Python or TypeScript. Works with OpenAI, Anthropic, LangChain, CrewAI, Google ADK, and any MCP server. Self-hosted, source-available. The 16 session patterns are on by default; thresholds are tunable per-org.
```bash
pip install clampd        # Python
npm install @clampd/sdk   # TypeScript
```