Security

Six primitives for production agents: PII redaction, injection detection, rate limiting, audit log, sandbox enforcement, and HITL approvals.

Agents that reach production face threats unit tests don't cover — sensitive data leaking into logs, user inputs that hijack the system prompt, runaway API costs, and tool calls that modify infrastructure. These primitives address each class of risk at the boundary closest to where it appears.

For the methodology behind them — threat modeling, when to apply each, team practices — see the Playbook's security pillar.

#Primitives

PII redaction — createPIIRedactor + DEFAULT_PII_RULES. Recipe.
Prompt injection detector — heuristics + pluggable model classifier. Recipe.
Rate limiting — token-bucket by user / IP / key. Recipe.
Signed audit log — hash-chain + HMAC. Recipe.
Mandatory sandbox — allow / deny / require / validators across every tool. Recipe.
Human-in-the-loop approvals — pause / resume / approve with persisted state. Recipe.

Explore nearby

✎ Edit this page on GitHub·Found a problem? Open an issue →·How to contribute →

Security

#Primitives

#Related

Explore nearby

On this page