Defense-in-depth for tool-using agents: trust boundaries, tool-call auditing, memory hygiene, and human-in-loop on high-risk paths.
User text, retrieved docs, tool outputs, agent-to-agent messages — any of these channels can carry an injected instruction. Tag every input with its origin and minimize mixing of content across trust levels.
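A minimal sketch of origin tagging, assuming a simple policy where only direct user text is allowed to carry instructions; the `Origin` labels and the trust set are illustrative, not a standard taxonomy.

```python
from dataclasses import dataclass
from enum import Enum

class Origin(Enum):
    USER = "user"
    RETRIEVED_DOC = "retrieved_doc"
    TOOL_OUTPUT = "tool_output"
    AGENT_MESSAGE = "agent_message"

# Assumed policy: only direct user text may be treated as instructions.
TRUSTED_FOR_INSTRUCTIONS = {Origin.USER}

@dataclass(frozen=True)
class TaggedContent:
    text: str
    origin: Origin

    def may_carry_instructions(self) -> bool:
        return self.origin in TRUSTED_FOR_INSTRUCTIONS

# A retrieved doc containing injection text is still just data.
doc = TaggedContent("Ignore previous instructions...", Origin.RETRIEVED_DOC)
print(doc.may_carry_instructions())  # False
```

The point of the frozen dataclass is that an origin label, once attached, cannot be silently rewritten downstream.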
Grant each agent only the minimum tool set it needs (least privilege). No general 'shell' or 'execute' tools without isolation. Each tool gets parameter validation and an output-trust label.
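The per-agent allowlist plus per-tool validation could look like the sketch below; the registry shape, the path regex, and the agent/tool names are all assumptions for illustration.

```python
import re

# Hypothetical registry: each tool carries a parameter validator and an
# output-trust label (tool outputs are untrusted input, per the point above).
TOOL_REGISTRY = {
    "read_file": {
        "validate": lambda p: bool(re.fullmatch(r"/data/[\w./-]+", p.get("path", ""))),
        "output_trust": "untrusted",
    },
}

# Least privilege: each agent sees only the tools it was granted.
AGENT_ALLOWLIST = {"summarizer": {"read_file"}}

def authorize_call(agent: str, tool: str, params: dict) -> bool:
    if tool not in AGENT_ALLOWLIST.get(agent, set()):
        return False  # tool not granted to this agent
    return TOOL_REGISTRY[tool]["validate"](params)

print(authorize_call("summarizer", "read_file", {"path": "/data/report.txt"}))  # True
print(authorize_call("summarizer", "shell", {"cmd": "rm -rf /"}))               # False
```

Note the ungranted 'shell' call is refused before any validator runs: the allowlist check comes first.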
Log inputs, outputs, and the prompt context that triggered each call. Anomaly-detect on call frequency, parameter content, and chain depth.
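A sketch of that audit trail with two of the named anomaly signals, call frequency and chain depth; the thresholds and log schema are assumptions, and a real deployment would add parameter-content checks.

```python
import time
from collections import deque

MAX_CALLS_PER_MINUTE = 30   # assumed threshold
MAX_CHAIN_DEPTH = 5         # assumed threshold

audit_log = []
recent_calls = deque()

def record_call(tool, params, output, prompt_context, chain_depth, now=None):
    """Log the call with its triggering context; return any anomaly flags."""
    now = time.time() if now is None else now
    audit_log.append({"tool": tool, "params": params, "output": output,
                      "context": prompt_context, "depth": chain_depth, "ts": now})
    # Sliding 60-second window over call timestamps.
    recent_calls.append(now)
    while recent_calls and now - recent_calls[0] > 60:
        recent_calls.popleft()
    anomalies = []
    if len(recent_calls) > MAX_CALLS_PER_MINUTE:
        anomalies.append("call_frequency")
    if chain_depth > MAX_CHAIN_DEPTH:
        anomalies.append("chain_depth")
    return anomalies
```

Keeping the prompt context alongside each call is what makes post-incident attribution possible: you can trace a bad call back to the injected text that triggered it.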
Memory writes are an attack vector — see /ai-security/self-replicating-memory-worm/. Validate writes; tag origin; expire suspicious entries.
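A minimal sketch of those three controls on memory writes; the injection patterns and TTL are illustrative stand-ins, not a vetted detection list.

```python
import time

# Illustrative injection markers and expiry window.
SUSPICIOUS = ("ignore previous", "you are now", "system prompt")
SUSPECT_TTL = 3600  # seconds

memory = []

def write_memory(text, origin, now=None):
    """Validate the write, tag its origin, and time-limit suspicious entries."""
    now = time.time() if now is None else now
    suspicious = any(p in text.lower() for p in SUSPICIOUS)
    if suspicious and origin != "user":
        return False  # reject injected-looking writes from untrusted sources
    memory.append({"text": text, "origin": origin, "ts": now,
                   "expires": now + SUSPECT_TTL if suspicious else None})
    return True

def read_memory(now=None):
    now = time.time() if now is None else now
    return [m for m in memory if m["expires"] is None or m["expires"] > now]
```

Expiry matters because a worm-style payload that slips past the pattern check still ages out instead of persisting across sessions indefinitely.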
High-risk actions (file writes, network calls, financial ops) require human-in-loop. No autonomous escalation.
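The gate can be as simple as routing high-risk tool names to a pending queue instead of the executor; the risk tier and function names here are hypothetical.

```python
# Assumed high-risk tier matching the categories above.
HIGH_RISK = {"write_file", "http_request", "transfer_funds"}

pending_approvals = []

def dispatch(tool, params, execute):
    """Execute low-risk calls; park high-risk calls for human approval."""
    if tool in HIGH_RISK:
        pending_approvals.append((tool, params))
        return {"status": "awaiting_human_approval"}
    return {"status": "executed", "result": execute(tool, params)}

def approve(index, execute):
    """Called only from the human review path, never by the agent."""
    tool, params = pending_approvals.pop(index)
    return {"status": "executed", "result": execute(tool, params)}
```

The key design point is that `approve` is not a tool exposed to the agent, so there is no autonomous path from "parked" to "executed".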
Tactic 11 (Agentic & Orchestrator Exploitation) has 16 techniques specifically for agents. Run them against your stack quarterly.