The next category of agent risk isn't a clever prompt injection. It's a perfectly normal-looking tool call that ends with money leaving your treasury.
Until recently, the conversation about AI agent risk centred on what an agent could read (data exfiltration) or destroy (DROP TABLE, rm -rf). That conversation is now incomplete. Google's AP2 (Agent Payments Protocol) for cards and bank rails, and the x402 standard for HTTP-native stablecoin payments, are no longer drafts. x402 v2 launched in December 2025; Stripe shipped x402 on Base in February 2026; Cloudflare supports x402 transactions; the AP2 consortium has 60+ organisations behind it including American Express, Mastercard, PayPal, Salesforce, Etsy, and Intuit. Google and Coinbase jointly launched the A2A x402 extension as a production-ready unified offering. Both protocols let an LLM-driven agent transact real money, with very different threat models from "regular" tool calls. Most existing AI security tooling still treats payment-protocol traffic as undifferentiated HTTP, and the gap between "agents are spending money in production" and "your security stack understands the spec" is widening.
This post: what these protocols actually look like on the wire, what can go wrong when agents speak them, and what we built to enforce policy at the protocol layer rather than the application layer. The implementation details below are pulled from services/crates/ag-gateway/src/ap2.rs and x402.rs in our gateway. They are honest about what we do and what we explicitly don't do.
AP2 in 90 seconds
AP2 is Google's protocol for binding a user's payment intent to a specific transaction the agent then executes. The unit of trust is the mandate, a cryptographically signed JSON document called a Verifiable Digital Credential (VDC). There are two flavours.
- Cart Mandate — signed by the user's device at the moment of purchase. Human is present. The mandate names the cart contents, amount, currency, payee, and the user's signature. The agent cannot move money without a fresh, valid Cart Mandate from the user.
- Intent Mandate — signed by the user in advance for cases where the human is not present at execution time. Carries a budget, a TTL, and a description of the kind of purchase the agent is authorised to make. The agent can spend up to the budget, until the TTL expires, on the kinds of purchases the mandate describes.
Both mandates are designed to be passed as parameters on payment-related tool calls. A naive integration just trusts whatever the LLM puts in the mandate field of the call.
x402 in 90 seconds
x402 is an open HTTP-native payment standard. The flow is brutally elegant: a server returns HTTP 402 Payment Required with a payment-required header that base64-encodes a JSON document. The JSON describes one or more accepted payment methods (chain, asset, amount, recipient address). The client (which, in agent land, is an LLM-driven workflow) picks one, signs the payment, retries the request, and gets the resource. The wire format is straightforward.
// What an x402 server responds with
HTTP/1.1 402 Payment Required
payment-required: <base64(JSON)>
// Decoded JSON looks like:
{
  "x402Version": 2,
  "error": "Payment required",
  "resource": { "url": "...", "description": "..." },
  "accepts": [
    {
      "scheme": "exact",
      "network": "eip155:8453",   // Base mainnet
      "amount": "10000",          // atomic units
      "asset": "0x036C...",       // USDC contract
      "payTo": "0x2096...",
      "maxTimeoutSeconds": 300
    }
  ]
}
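Because amount is expressed in atomic units, its dollar value depends entirely on the token's decimals. The following is a minimal sketch of that arithmetic, not code from our gateway. It assumes a token pegged 1:1 to USD and a decimals value supplied by the caller; the function name atomic_to_cents_ceil is hypothetical.

```rust
/// Convert an x402 `amount` string (atomic units) into USD cents, rounding up
/// so that a spend-cap check never under-counts.
///
/// Assumptions (not taken from the x402 spec): the token is pegged 1:1 to USD
/// and the caller already knows the token's `decimals`.
fn atomic_to_cents_ceil(amount: &str, decimals: u32) -> Option<u128> {
    // Reject anything that is not a non-negative integer.
    let atomic: u128 = amount.parse().ok()?;

    // cents = atomic * 100 / 10^decimals, with overflow checks on each step.
    let scale = 10u128.checked_pow(decimals)?;
    let numerator = atomic.checked_mul(100)?;

    // Ceiling division: add one cent if there is any remainder.
    Some(numerator / scale + u128::from(numerator % scale != 0))
}
```

Under these assumptions, the string "10000" at 6 decimals (the value USDC uses on EVM chains) comes to 1 cent, while the same string at 2 decimals comes to 10,000 cents, or $100. The amount alone says nothing until the asset and its decimals are pinned down.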
The threats nobody is wired up for
Every threat below is observable on the wire. Almost no security tooling we've audited (see our comparison page) has primitives that match.
- Mandate replay. A previously-signed Intent Mandate is reused after expiration or re-used twice within its window. The mandate has a jti or equivalent ID; without server-side replay protection, an attacker who exfiltrates one mandate can replay it.
- Stolen mandate to a different agent. The mandate doesn't bind to a specific agent ID, so if it leaks, any agent can spend against the budget. AP2 v0.2.0 added optional agent_id binding for exactly this.
- Merchant collusion / fake co-signature. A Cart Mandate that lists a payee that wasn't actually approved by the user. If the agent doesn't validate against an approved-payees list, a prompt-injected agent can send money to an attacker-controlled merchant ID.
- Budget overrun. An Intent Mandate authorises $200/month. Nothing in the protocol prevents the agent from spending $190 on day 1 and then $190 again because the wallet's signing flow doesn't see the rolling total.
- x402 chain swap. A server quotes USDC on Base among the options in accepts; the agent's wallet picks USDT on Polygon instead because the wallet's selection logic prefers cheaper gas. The agent now spends a different stablecoin than the user expected, on a chain the user may not have approved.
- Currency confusion. The amount is "10000" but the asset is some non-stablecoin token whose unit price is volatile. Without verifying that the asset is a known USD-pegged stablecoin with the expected number of decimals, the agent can move 10x what it thinks it is moving.
- SSRF via x402 server. A malicious tool that doesn't actually need payment returns 402 with a valid-looking payment-required blob, just to lure the agent's wallet into broadcasting a real transaction. The 402 IS the attack.
Some of these are caught by the protocol specs. Some aren't. Almost none are caught by general-purpose AI security tools that don't know about mandates and 402s.
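To make the replay threat concrete, here is a minimal sketch of server-side replay protection keyed on the mandate ID. It is illustrative only: the ReplayGuard type, its method, and the in-memory HashMap are hypothetical stand-ins, and a real deployment would need a store shared by every instance that can see the mandate.

```rust
use std::collections::HashMap;

/// In-memory stand-in for a shared store (hypothetical; one process only).
struct ReplayGuard {
    /// Mandate ID -> mandate expiry (Unix seconds).
    seen: HashMap<String, u64>,
}

#[derive(Debug, PartialEq)]
enum ReplayCheck {
    Fresh,
    Expired,
    Replayed,
}

impl ReplayGuard {
    fn new() -> Self {
        Self { seen: HashMap::new() }
    }

    /// Accept a mandate ID the first time it is presented; reject repeats and
    /// anything at or past its own expiry. `expires_at` is assumed to come
    /// from the mandate body.
    fn check_and_record(&mut self, mandate_id: &str, expires_at: u64, now: u64) -> ReplayCheck {
        // Forget IDs whose mandates have expired. They cannot be replayed
        // later because the expiry check below rejects them anyway.
        self.seen.retain(|_, exp| *exp > now);

        if expires_at <= now {
            return ReplayCheck::Expired;
        }
        if self.seen.contains_key(mandate_id) {
            return ReplayCheck::Replayed;
        }
        self.seen.insert(mandate_id.to_string(), expires_at);
        ReplayCheck::Fresh
    }
}
```

Keeping each ID only until the mandate's own expiry bounds the size of the store without reopening a replay window.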
What we built
Clampd's gateway intercepts both protocols at the tool-call layer, before any wallet signs anything. The implementation is small, deliberately conservative, and honest about what it can verify cryptographically and what it can't.
AP2 mandate validation
When a tool call carries an AP2 mandate (in mandate, cart_mandate, intent_mandate, or ap2_mandate param keys, raw or base64), the gateway extracts and validates it against per-agent boundary config:
// services/crates/ag-gateway/src/ap2.rs
pub struct Ap2Boundaries {
    /// Maximum single payment in cents (0 = no limit).
    pub max_payment_per_tx_cents: u64,
    /// Maximum hourly spend in cents (0 = no limit).
    pub max_payment_per_hour_cents: u64,
    /// Approved payee identifiers (empty = all payees allowed).
    pub approved_payees: Vec<String>,
}
Concretely, the gateway checks:
- Structure: payer/payee IDs present, amount > 0, currency a recognisable ISO-4217 code, signature well-formed (length and hex validity).
- TTL: Intent Mandate not past expiry. Replay protection by storing the mandate's jti/id in Redis with a TTL that matches the mandate's own expiry.
- Per-tx and hourly cap: amount must be ≤ max_payment_per_tx_cents; rolling hour total (read from Redis under the agent's spend key) must remain under max_payment_per_hour_cents.
- Approved-payee allowlist: when the org has configured one, the mandate's payee_id must be in it.
- Agent binding: if the mandate specifies an agent_id binding (AP2 v0.2.0), it must match the calling agent's identity. Stops mandate theft across agents.
- Risk modifier: a valid human-present Cart Mandate adds 0.0 to assessed risk. A valid human-not-present Intent Mandate adds 0.2. An invalid or missing mandate adds 0.5.
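The non-cryptographic checks in this list can be sketched as a single function. This is a simplified illustration, not the contents of ap2.rs: MandateView, Denial, and check_boundaries are hypothetical names, the mandate is assumed to be parsed already, and the rolling hour total is passed in rather than read from Redis. Both caps are treated as inclusive here, which is an assumption.

```rust
/// Mirrors the struct shown earlier; redeclared so the sketch compiles alone.
pub struct Ap2Boundaries {
    pub max_payment_per_tx_cents: u64,
    pub max_payment_per_hour_cents: u64,
    pub approved_payees: Vec<String>,
}

/// Hypothetical, already-parsed view of a mandate (not the AP2 wire format).
pub struct MandateView {
    pub payee_id: String,
    pub amount_cents: u64,
    pub expires_at: Option<u64>,        // Unix seconds, if the mandate has a TTL
    pub bound_agent_id: Option<String>, // optional agent binding
}

#[derive(Debug, PartialEq)]
pub enum Denial {
    Expired,
    OverPerTxCap,
    OverHourlyCap,
    PayeeNotApproved,
    AgentMismatch,
}

pub fn check_boundaries(
    m: &MandateView,
    b: &Ap2Boundaries,
    calling_agent: &str,
    spent_this_hour_cents: u64,
    now: u64,
) -> Result<(), Denial> {
    // TTL: reject a mandate at or past its expiry.
    if let Some(exp) = m.expires_at {
        if now >= exp {
            return Err(Denial::Expired);
        }
    }
    // Per-transaction cap (0 means "no limit").
    if b.max_payment_per_tx_cents != 0 && m.amount_cents > b.max_payment_per_tx_cents {
        return Err(Denial::OverPerTxCap);
    }
    // Hourly cap: the projected total including this payment must fit.
    if b.max_payment_per_hour_cents != 0 {
        let projected = spent_this_hour_cents.saturating_add(m.amount_cents);
        if projected > b.max_payment_per_hour_cents {
            return Err(Denial::OverHourlyCap);
        }
    }
    // Payee allowlist (an empty list means "all payees allowed").
    if !b.approved_payees.is_empty() && !b.approved_payees.iter().any(|p| p == &m.payee_id) {
        return Err(Denial::PayeeNotApproved);
    }
    // Agent binding: only enforced when the mandate carries one.
    if let Some(bound) = &m.bound_agent_id {
        if bound.as_str() != calling_agent {
            return Err(Denial::AgentMismatch);
        }
    }
    Ok(())
}
```

Returning a typed denial rather than a boolean keeps the reason available for the audit log and for the response the agent sees.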
x402 interception
For x402, the gateway is positioned in front of the tool's downstream HTTP request. When the downstream returns 402, the gateway parses the payment-required header before the response reaches the agent. It then:
- Validates the JSON shape (required fields, x402 version, accepts array non-empty).
- Resolves the asset contract address against a known list of USD-pegged stablecoins (USDC, USDT, DAI, BUSD, TUSD, USDP, FRAX). Unknown assets are flagged as "unverified currency" rather than silently approved.
- Checks the network against the configured chain allowlist (we ship eight networks: Ethereum, Base, Base Sepolia, Polygon, Arbitrum, Optimism, Avalanche, Solana). Unknown chains default-deny.
- Computes the worst-case USD amount across all entries in the accepts array (in case the wallet picks a non-default option) and applies the same per-tx and hourly caps as AP2.
- Picks the lowest-risk entry and rewrites the response so the agent's wallet sees only that option, not the full accepts menu. The agent loses the choice; you lose the chain-swap attack vector.
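A rough sketch of the selection step, under stated assumptions: the accepts entries are already decoded and converted to cents, "lowest-risk" is approximated as the cheapest entry that passes both allowlists, and unknown assets are simply excluded rather than flagged. Accept, X402Policy, X402Denial, and select_accept are hypothetical names and this is not the code in x402.rs.

```rust
/// Hypothetical, already-decoded entry from the `accepts` array.
#[derive(Debug, Clone, PartialEq)]
pub struct Accept {
    pub network: String,
    pub asset: String,
    pub amount_cents: u64, // assumed converted from atomic units upstream
}

pub struct X402Policy {
    pub allowed_networks: Vec<String>,
    /// (network, asset contract address) pairs treated as known stablecoins.
    pub known_stablecoins: Vec<(String, String)>,
    /// 0 means "no limit", matching the AP2 boundary convention.
    pub max_payment_per_tx_cents: u64,
}

#[derive(Debug, PartialEq)]
pub enum X402Denial {
    EmptyAccepts,
    OverPerTxCap,
    NoAcceptableOption,
}

pub fn select_accept(accepts: &[Accept], policy: &X402Policy) -> Result<Accept, X402Denial> {
    // Worst case first: cap the most expensive entry the wallet could pick.
    let worst = accepts
        .iter()
        .map(|a| a.amount_cents)
        .max()
        .ok_or(X402Denial::EmptyAccepts)?;
    if policy.max_payment_per_tx_cents != 0 && worst > policy.max_payment_per_tx_cents {
        return Err(X402Denial::OverPerTxCap);
    }

    // Keep only entries on an allowed chain paying in a known stablecoin,
    // then pick the cheapest as a stand-in for "lowest risk".
    accepts
        .iter()
        .filter(|a| policy.allowed_networks.iter().any(|n| n == &a.network))
        .filter(|a| {
            policy
                .known_stablecoins
                .iter()
                .any(|(n, addr)| n == &a.network && addr.eq_ignore_ascii_case(&a.asset))
        })
        .min_by_key(|a| a.amount_cents)
        .cloned()
        .ok_or(X402Denial::NoAcceptableOption)
}
```

The caller would then rewrite the 402 response so that accepts contains only the returned entry. Matching the asset per network matters because the same token lives at a different contract address on each chain.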
Cryptographic verification of mandate signatures (verifying the user's actual signing key, not just signature format) requires CA and key-discovery infrastructure that we don't ship in the open-source proxy. We validate structure, TTL, budget, approved-payee allowlist, agent binding, and signature well-formedness. We do not currently verify the cryptographic signature against the user's public key. That's a roadmap item; the gateway is structured so it can be added without breaking existing integrations. Don't read this post and assume we're doing more than we are.
What this looks like in your config
For an org running an agent that books travel from an Intent Mandate, the per-agent boundary config might look like:
// per-agent boundary configuration
{
  "max_payment_per_tx_cents": 50000,     // $500
  "max_payment_per_hour_cents": 200000,  // $2000
  "approved_payees": [
    "merchant:acme-airlines",
    "merchant:globalhotels"
  ]
}
An LLM-driven agent that talks to a malicious payment endpoint pretending to be a hotel booking gets blocked at the boundary check. The agent sees a typed denial. Nothing leaves the wallet. The audit log gets the full mandate JSON, the matched policy, and the denial reason.
Why the boundary, not the wallet
You might ask: why isn't this enforced inside the wallet itself? Two reasons.
First, the wallet is downstream of every decision the LLM has already made. By the time the wallet is asked to sign, the agent has already chosen a tool, picked parameters, and accepted whatever accepts entry the response offered. Enforcing at the wallet means relitigating decisions the LLM made minutes ago.
Second, the wallet sees one transaction at a time. Per-hour limits, approved-payees lists, and per-agent risk profiles need a service that observes all the agent's payment activity, including failed and aborted attempts, including across multiple tools and multiple wallets. That service has to live above the wallet, at the agent's policy boundary. That's where Clampd sits.
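The second point is easier to see in code. Below is a minimal sketch of a per-agent rolling-hour spend window that every tool and wallet would report into. HourlySpend and try_spend are hypothetical names, the state is in-memory rather than in a shared store, and payments are assumed to arrive in non-decreasing time order.

```rust
use std::collections::VecDeque;

/// Sliding one-hour spend window for a single agent (hypothetical sketch).
pub struct HourlySpend {
    /// (Unix seconds, cents) for each approved payment, oldest first.
    events: VecDeque<(u64, u64)>,
    total_cents: u64,
}

impl HourlySpend {
    pub fn new() -> Self {
        Self { events: VecDeque::new(), total_cents: 0 }
    }

    /// Drop payments that are an hour old or older and subtract them
    /// from the running total.
    fn evict(&mut self, now: u64) {
        while let Some(&(ts, cents)) = self.events.front() {
            if now.saturating_sub(ts) < 3600 {
                break;
            }
            self.total_cents -= cents;
            self.events.pop_front();
        }
    }

    /// Approve and record the payment only if the rolling total stays
    /// within `cap_cents`; otherwise leave the window unchanged.
    pub fn try_spend(&mut self, now: u64, cents: u64, cap_cents: u64) -> bool {
        self.evict(now);
        match self.total_cents.checked_add(cents) {
            Some(total) if total <= cap_cents => {
                self.events.push_back((now, cents));
                self.total_cents = total;
                true
            }
            _ => false,
        }
    }
}
```

With a $200 cap, a first $190 payment is approved and a second $190 payment a minute later is refused, regardless of which wallet would have signed it. A single wallet that only sees its own transaction cannot make that call.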
What you can do without Clampd
Even if you never install us, three things are worth doing this quarter for any agent that touches money:
- Log every mandate verbatim into your audit trail. Not just the result. The full signed JSON. When something goes wrong with an agent payment, you want the original mandate to dispute it with the bank, the merchant, or the chain. Without it you've got nothing.
- Add an explicit approved_payees allowlist server-side, somewhere your LLM cannot influence. Even a hand-edited YAML in your repo beats nothing. The single most useful question to ask any agent payment integration is "which 12 vendors is this agent allowed to pay?"
- Treat HTTP 402 as security-relevant. If your agent can hit any URL on the public internet and any of those URLs can return 402, your agent's wallet is one step removed from broadcasting a transaction. Logging, alerting, or just an explicit allowlist of which downstream tools may legitimately return 402 is a useful first defence.
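The 402 allowlist in the last item can be very small. A minimal sketch, assuming you already have the response status and the downstream host in hand; Verdict, classify_response, and the may_charge set are hypothetical names for illustration.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
pub enum Verdict {
    /// Not a payment request; handle as a normal response.
    PassThrough,
    /// A 402 from a host that is expected to charge.
    AllowPayment,
    /// A 402 from a host that should never ask for money.
    BlockAndAlert,
}

/// Classify a downstream response before the agent's wallet can see it.
/// `may_charge` holds lowercase hostnames that may legitimately return 402.
pub fn classify_response(status: u16, host: &str, may_charge: &HashSet<String>) -> Verdict {
    if status != 402 {
        return Verdict::PassThrough;
    }
    if may_charge.contains(&host.to_ascii_lowercase()) {
        Verdict::AllowPayment
    } else {
        Verdict::BlockAndAlert
    }
}
```

Like the payee list, the host set belongs somewhere the LLM cannot edit, such as a file checked into your repo.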
None of this needs Clampd. The protocols are open. The threats are observable. We picked this problem because the gap between "protocol exists" and "production-grade defense exists" is wider here than anywhere else in agent security right now.
Try Clampd in 60 seconds
One line of Python or TypeScript. Works with OpenAI, Anthropic, LangChain, CrewAI, Google ADK, and any MCP server. Self-hosted, source-available, no telemetry by default. AP2 and x402 enforcement is built in; configure boundaries per agent in the dashboard.
pip install clampd
npm install @clampd/sdk