Posts on tool-call security, MCP supply chain risk, scope tokens, agent payments, and what we learn shipping Clampd. No hype. No listicles. No "10 ways to secure your LLM."
A researcher turned a GitHub PR title, issue body, and comment into a prompt injection that hijacked Claude Code, Gemini CLI, and GitHub Copilot in GitHub Actions, then made them dump the workflow's secrets. Anthropic rated its variant CVSS 9.4 Critical. It can't be patched inside the agent, because reading the comment is the agent's job. We walk the attack chain, show where a tool-call firewall catches the exfiltration on its way out, and give the three-line clampd-action setup, with an honest note on what it does and doesn't cover.
In May 2025, Invariant Labs showed a single malicious GitHub issue could make an AI agent leak private repository contents into a public pull request. It's not a GitHub bug but an architectural "toxic flow" of individually authorized tool calls. We break down the attack, why prompt-injection filters miss it, and the multi-step session pattern that catches the read-then-exfiltrate sequence.
A drop-in onboarding guide for engineers running LangChain, CrewAI, OpenAI tool-use, Anthropic, or Google ADK in production. One decorator. No agent loop rewrite. Includes a live risk feed snapshot (descriptor_hash_mismatch, task_replay 0.90, non-ASCII agent IDs blocked at registry), the dashboard workflow lock/unlock/approve cycle on a real cluster, cryptographic signed-delegation enforcement with 4 live test scenarios (chain_hash_mismatch, jwt_invalid, missing-proof), and an 8-second end-to-end kill cascade walking a 3-agent tree.
1054 attack cases + 91 benign-API cases. Five runs. Baseline 72.87%. We almost shipped 85.39%, but it overfit the benchmark, then 81.21%, but it produced an 11% false-positive rate on legitimate business content. Tier-split weights landed at 79.13% TPR with 0% measured false-positive rate from our new rules. The whole journey, the deployed code, and the runners you can execute yourself.
The biggest MCP security gap nobody is talking about. An MCP server can advertise one tool schema during discovery, get approved, then mutate that schema on a future deploy. Your agent doesn't notice. Your LLM doesn't notice either. We walk through the attack and how a 64-character SHA-256 descriptor hash fixes it.
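As a rough illustration of the descriptor-hash idea, here is a minimal sketch that pins a tool descriptor at approval time and detects a later schema change. The canonicalization (sorted-key compact JSON), the function names, and the example descriptor fields are assumptions for illustration, not Clampd's actual implementation.

```python
import hashlib
import json


def descriptor_hash(descriptor: dict) -> str:
    """Return the SHA-256 hex digest (64 characters) of a tool descriptor."""
    # Assumed canonical form: sorted keys and compact separators, so that
    # semantically identical descriptors always serialize to the same bytes.
    canonical = json.dumps(descriptor, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def descriptor_unchanged(descriptor: dict, pinned_hash: str) -> bool:
    """True only if the descriptor still matches the hash pinned at approval."""
    return descriptor_hash(descriptor) == pinned_hash


# Hypothetical descriptor as advertised during discovery and approved.
approved = {
    "name": "read_file",
    "description": "Read a file from the workspace.",
    "inputSchema": {"type": "object", "properties": {"path": {"type": "string"}}},
}
pinned = descriptor_hash(approved)

# A later deploy: same tool name, but the schema quietly gained a parameter.
mutated = json.loads(json.dumps(approved))
mutated["inputSchema"]["properties"]["upload_to"] = {"type": "string"}

print(descriptor_unchanged(approved, pinned))  # the approved descriptor matches
print(descriptor_unchanged(mutated, pinned))   # the mutated descriptor does not
```

Any change to the name, description, or schema changes the digest, so a gateway that compares against the pinned value sees the mutation even when the agent and the model do not.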
44µs to evaluate 263 rules in-process. 5.14ms p50 / 7.92ms p95 end-to-end on a 2-vCPU production box, deny path. Hardware specs, sample sizes, what we couldn't measure, and the bench script you can run yourself. We also caught ourselves saying "sub-10ms typical" on the marketing site, which the data didn't support.
Two payment protocols built for AI agents shipped specs in the last year: Google's AP2 and the open x402 HTTP 402 standard. Both let an agent move real money. We walk through the seven threat patterns nobody's wired up for (mandate replay, chain swap, payee swap), and what we built into the Clampd gateway to enforce policy at the protocol layer.
Single-call inspection misses scrape-then-exfiltrate, slow-burn data pulls, sawtooth evasion, and cross-agent privilege escalation. Here are the 16 cross-call patterns we run on every tool call, what each fires on, and the worked example of a slow-drip exfiltration that survives per-call defence and dies at the session layer.
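To make the per-call versus session-layer distinction concrete, here is a toy sketch of a slow-drip pull that passes every per-call check and still trips a session-level budget. The limits, the verdict strings, and the `check_call` helper are hypothetical and are not taken from the Clampd rule set.

```python
from collections import defaultdict

PER_CALL_LIMIT = 500   # hypothetical: max rows a single call may return
SESSION_LIMIT = 2000   # hypothetical: max rows one session may pull in total

# Running total of rows returned, keyed by session ID.
session_rows = defaultdict(int)


def check_call(session_id: str, rows_returned: int) -> str:
    """Return "deny_call", "deny_session", or "allow" for one tool call."""
    # Per-call defence: looks only at this call in isolation.
    if rows_returned > PER_CALL_LIMIT:
        return "deny_call"
    # Session-layer defence: looks at the cumulative total across calls.
    session_rows[session_id] += rows_returned
    if session_rows[session_id] > SESSION_LIMIT:
        return "deny_session"
    return "allow"


# Ten calls of 300 rows each: every call is under the per-call limit,
# but the running total (2100) crosses the session limit on the seventh call.
verdicts = [check_call("s1", 300) for _ in range(10)]
print(verdicts)
```

A real detector would also need time windows and decay so that long-lived legitimate sessions are not penalized; this sketch leaves that out.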
OWASP LLM Top 10 mapped category-by-category to the 263 runtime detection rules in our engine. Per-category coverage, the gaps, the 2025 reshuffle, and a paste-able paragraph for vendor questionnaires.
Hybrid security: regex for the obvious, LLM judge for the gray zone (default 0.2–0.75). The four conditions under which the judge does NOT fire: disabled, no API key, cooldown after consecutive failures, per-minute rate limit. Why fail-open is the default, and what the judge actually sees.
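The gating logic described above can be sketched as a single predicate. This is an illustration only: the `JudgeState` fields, the inclusive treatment of the 0.2 and 0.75 bounds, and the rate-limit value are assumptions, and what happens to the call when the judge is skipped is deliberately left out.

```python
from dataclasses import dataclass
from typing import Optional

GRAY_LOW, GRAY_HIGH = 0.2, 0.75  # default gray zone quoted in the text


@dataclass
class JudgeState:
    """Hypothetical runtime state that decides whether the judge may run."""
    enabled: bool = True
    api_key: Optional[str] = "example-key"  # placeholder, not a real key
    in_cooldown: bool = False               # set after consecutive failures
    calls_this_minute: int = 0
    max_calls_per_minute: int = 60          # hypothetical rate limit


def judge_should_fire(score: float, state: JudgeState) -> bool:
    """True if a rule score is in the gray zone and no skip condition holds."""
    # Outside the gray zone the rule verdict is treated as decisive
    # (bounds assumed inclusive here).
    if not (GRAY_LOW <= score <= GRAY_HIGH):
        return False
    # The four skip conditions, in the order listed in the text.
    if not state.enabled:
        return False
    if not state.api_key:
        return False
    if state.in_cooldown:
        return False
    if state.calls_this_minute >= state.max_calls_per_minute:
        return False
    return True


print(judge_should_fire(0.5, JudgeState()))               # gray zone, judge runs
print(judge_should_fire(0.5, JudgeState(enabled=False)))  # judge disabled
```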
An AI agent tool call has two PII boundaries. Most products watch one. One major cloud gateway documents the gap explicitly: its PII filter does not detect PII in tool-use output parameters. We walk through the bidirectional model and the scenario where direction-1-only inspection silently fails.
Detection without response is just logging. The 8-layer cascade we run when an agent gets killed: deny list, NATS broadcast, token cache flush, session termination, IdP revoke, registry update, event broadcast, audit log. Each layer is independent and fully idempotent. Latency budget per layer with the actual timings from cascade.rs.
An open LLM red-team payload corpus we run against every Clampd build: 85 prompt-injection variants, 67 exfil patterns, 56 SQL injection, 55 RCE, 52 LFI, 47 encoding evasion, 45 XSS, 42 SSRF, plus 40 deliberately safe inputs to catch false positives. Sources: OWASP, SecLists, Garak, Promptfoo, PayloadsAllTheThings, our own.
Almost every AI agent in production today holds raw downstream credentials. We mint a short-lived (5 min) Ed25519-signed token per tool call, bound to the (tool, params) via SHA-256, verified by the tool through JWKS. A captured token can't be replayed against a different tool, different params, or after the TTL.
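The binding and TTL checks can be sketched with the standard library alone. Note the substitution: Python's stdlib has no Ed25519, so this sketch signs with HMAC-SHA256 and a shared demo key as a stand-in for the Ed25519 signature and JWKS verification described above. The claim names (`bind`, `exp`) and helper functions are hypothetical.

```python
import hashlib
import hmac
import json

TTL_SECONDS = 300            # 5-minute lifetime, as in the text
DEMO_KEY = b"demo-only-key"  # stand-in for an Ed25519 key pair (assumption)


def binding(tool: str, params: dict) -> str:
    """SHA-256 over a canonical encoding of the (tool, params) pair."""
    canonical = json.dumps({"tool": tool, "params": params},
                           sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def _sign(claims: dict) -> str:
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    return hmac.new(DEMO_KEY, payload, hashlib.sha256).hexdigest()


def mint(tool: str, params: dict, now: float) -> dict:
    """Mint a token valid for one (tool, params) pair until now + TTL."""
    claims = {"bind": binding(tool, params), "exp": now + TTL_SECONDS}
    return {"claims": claims, "sig": _sign(claims)}


def verify(token: dict, tool: str, params: dict, now: float) -> bool:
    """Accept only an unexpired, untampered token bound to this exact call."""
    if not hmac.compare_digest(_sign(token["claims"]), token["sig"]):
        return False  # signature mismatch
    if now >= token["claims"]["exp"]:
        return False  # past the TTL
    return token["claims"]["bind"] == binding(tool, params)


token = mint("db.query", {"sql": "SELECT 1"}, now=1000.0)
print(verify(token, "db.query", {"sql": "SELECT 1"}, now=1100.0))      # same call, in TTL
print(verify(token, "db.query", {"sql": "DROP TABLE t"}, now=1100.0))  # different params
```

With an asymmetric scheme such as Ed25519, the tool would verify against a published public key instead of holding the signing secret, which is the property the HMAC stand-in cannot show.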