Volume VII: Appendices & Resources
Reference materials — the complete attack catalog, detection signatures, tools, assessment templates, case studies, and glossary.
Appendix A: Top 25 Critical Techniques
The 25 highest-risk techniques across all 15 AATMF tactics, ranked by AATMF-R v3 score. All 25 score at Critical (250+). The full catalog of all 240 techniques is available in the tactic volumes.
| # | ID | Technique | Score |
|---|---|---|---|
| 1 | T14-AT-007 | Nation-State AI Warfare | 280 |
| 2 | T11-AT-016 | Tool-Induced SSRF & Local Resource | 275 |
| 3 | T6-AT-003 | Backdoor Insertion | 270 |
| 4 | T11-AT-015 | Autonomous Replication | 270 |
| 5 | T14-AT-005 | Critical Infrastructure Attacks | 270 |
| 6 | T14-AT-014 | Systemic Risk Creation | 270 |
| 7 | T11-AT-001 | Browser Automation Hijacking | 265 |
| 8 | T14-AT-001 | GPU Farm Hijacking | 265 |
| 9 | T14-AT-012 | Cloud Provider Exploitation | 265 |
| 10 | T6-AT-002 | Dataset Contamination | 260 |
| 11 | T11-AT-013 | Supply Chain Attacks via Agents | 260 |
| 12 | T13-AT-010 | Hardware Supply Chain | 260 |
| 13 | T14-AT-008 | Ransomware via AI Systems | 260 |
| 14 | T15-AT-015 | Insider Threat Recruitment | 260 |
| 15 | T11-AT-002 | Tool Chain Exploitation | 255 |
| 16 | T11-AT-014 | Physical World Interactions | 255 |
| 17 | T13-AT-001 | Model Repository Poisoning | 255 |
| 18 | T14-AT-004 | Market Manipulation via AI | 255 |
| 19 | T14-AT-013 | Economic Espionage | 255 |
| 20 | T6-AT-001 | Reward Hacking | 250 |
| 21 | T10-AT-012 | Secure Enclave Bypasses | 250 |
| 22 | T11-AT-008 | Credential Harvesting | 250 |
| 23 | T13-AT-006 | Checkpoint Poisoning | 250 |
| 24 | T14-AT-010 | Data Center Attacks | 250 |
| 25 | T15-AT-004 | Reviewer Bribery & Coercion | 250 |
Appendix B: Detection Signatures
YARA rules for content-level analysis and Sigma rules for log-level detection. These signatures can be deployed alongside existing security tooling.
signatures/
├── yara/
│ ├── t01-prompt-injection.yar
│ ├── t02-encoding-evasion.yar
│ ├── t09-multimodal-injection.yar
│ ├── t11-mcp-tool-poisoning.yar
│ ├── t13-supply-chain.yar
└── sigma/
├── t05-model-extraction.yml
├── t07-data-exfiltration.yml
├── t11-agent-anomaly.yml
└── t14-infrastructure.yml YARA Rules (Content Analysis)
Sigma Rules (Log Analysis)
Appendix C: Tools & Scripts Reference
| Tool | Purpose | Coverage | License |
|---|---|---|---|
| PromptGuard 2 (Meta) | Real-time prompt injection classifier | T1, T2, T9 | Apache 2.0 |
| LlamaFirewall (Meta) | Comprehensive AI firewall (input + agent + code) | T1, T2, T7, T11 | Apache 2.0 |
| CaMeL (Google DeepMind) | Dual-LLM architecture with capability-based access | T11 | Research |
| PEFTGuard (Open Source) | Backdoor detection in PEFT (LoRA) adapters | T13 | Open Source |
| DRS Defense (Research) | Data Randomized Smoothing for training poisoning | T6 | Research |
| SafeTensors (HuggingFace) | Safe model serialization format (no code execution) | T13 | Apache 2.0 |
| Garak (NVIDIA) | LLM vulnerability scanner | T1–T8 | Apache 2.0 |
| PyRIT (Microsoft) | Python Risk Identification Toolkit for generative AI | T1–T12 | MIT |
Appendix D: Assessment Templates
AI Security Assessment Checklist
Pre-Assessment
Assessment
Post-Assessment
Finding Report Template
# Finding: [Title]
## Classification
- AATMF Tactic: T[n] — [Name]
- AATMF Technique: T[n]-AT-[seq]
- Risk Score: [score] ([CRITICAL/HIGH/MEDIUM/LOW/INFO])
- CVSS v3.1: [score] (if applicable)
## Description
[Clear description of the vulnerability]
## Proof of Concept
[Steps to reproduce, including exact prompts/inputs]
## Impact
[Business and technical impact assessment]
## Mitigation
[Specific remediation steps]
## Compliance Mapping
- OWASP LLM Top 10: [LLM0x]
- MITRE ATLAS: [AML.Txxxx]
- EU AI Act: [Article] Appendix E: Case Studies
Real-world attacks and research findings from 2025–2026 that shaped AATMF v3.
E.1 Policy Puppetry — Universal Model Bypass HiddenLayer, April 2025 T1, T2, T3
Reformulating adversarial prompts as XML, INI, or JSON policy configuration files causes LLMs to interpret them as authoritative system-level instructions. Achieves universal bypass across GPT-4o, GPT-4.5, o1, o3-mini, Claude 3.5/3.7, Gemini 1.5/2.0/2.5, Llama 3/4, DeepSeek V3/R1, Qwen 2.5, and Mistral.
Key Insight
Models trained on technical documentation treat configuration-style formatting as high-authority context, overriding safety alignment.
E.2 Autonomous LRM Jailbreaking Nature Communications, August 2025 T3, T4
Four large reasoning models deployed as multi-turn adversarial agents against nine target models achieved 97.14% ASR. More capable reasoning models are paradoxically better at subverting alignment in others.
Key Insight
Reasoning capabilities are attack capabilities. This validates AATMF's prediction that LRM advancements would be weaponized.
E.3 PoisonedRAG USENIX, USENIX Security 2025 T12
Injecting as few as 5 adversarially crafted texts into a knowledge base with millions of clean documents controls the model's responses to specific target questions. ASR reached 99% on HotpotQA.
Key Insight
The semantic similarity search at the heart of RAG is fundamentally exploitable — the same mechanism that makes retrieval useful makes it poisonable.
E.4 MCP Tool Poisoning Invariant Labs, 2025 T11
84.2% ASR via direct tool description poisoning, shadow attacks (malicious server manipulates trusted tools without being invoked), and rug pull attacks (silently altering descriptions post-approval).
Key Insight
The MCP design — where tool descriptions are processed as natural language — is architecturally vulnerable to injection.
E.5 ShadowMQ — Copy-Pasted RCE Oligo Security, November 2025 T14
Unsafe ZeroMQ socket patterns were literally copy-pasted across major inference frameworks — vLLM, TensorRT-LLM, and Modular Max Server. Thousands of exposed ZMQ sockets found on the public internet.
Key Insight
AI infrastructure inherits all traditional software vulnerabilities, amplified by the speed of framework adoption and code reuse without security review.
E.6 250 Poisoned Documents — Universal Training Backdoor Turing Institute / Anthropic / UK AISI, October 2025 T6
Injecting just 250 specially crafted documents into training data backdoors models from 600M to 13B parameters trained on up to 260B tokens. The actual threshold for poisoning is negligibly small.
Key Insight
The sheer scale of pretraining data works against defenders. 250 documents in billions is a needle in a haystack that training cannot filter out.
Appendix F: Glossary
| Term | Definition |
|---|---|
| AATMF | Adversarial AI Threat Modeling Framework |
| ASR | Attack Success Rate — percentage of attempts that achieve the adversarial objective |
| CaMeL | CApability-Mediated LLM — Google DeepMind's dual-LLM security architecture |
| CoT | Chain-of-Thought — step-by-step reasoning in LLMs |
| DPO | Direct Preference Optimization — alignment training technique |
| DRS | Data Randomized Smoothing — defense against training data poisoning |
| H-CoT | Hijacked Chain-of-Thought — attack that subverts CoT safety reasoning |
| LRM | Large Reasoning Model — models with explicit reasoning capabilities (o1, o3, DeepSeek-R1) |
| MCP | Model Context Protocol — Anthropic's standard for tool integration |
| PEFT | Parameter-Efficient Fine-Tuning — techniques like LoRA for efficient model adaptation |
| RAG | Retrieval-Augmented Generation — architecture combining search with generation |
| RLHF | Reinforcement Learning from Human Feedback — primary alignment technique |
| SafeTensors | Secure model serialization format that prevents code execution |
| TEE | Trusted Execution Environment — hardware-based security enclave |
Key References
- HiddenLayer. "Policy Puppetry: A Universal Jailbreak." April 2025.
- Zeng et al. "Autonomous LRM Jailbreaking." Nature Communications, August 2025.
- Xue et al. "PoisonedRAG: Knowledge Corruption Attacks." USENIX Security 2025.
- Invariant Labs. "MCP-ITP: Tool Poisoning in Agentic Systems." April 2025.
- Oligo Security. "ShadowMQ: Unsafe Deserialization in AI Inference Frameworks." November 2025.
- Sherburn et al. "250 Documents: Universal Pretraining Backdoors." Turing Institute/Anthropic/UK AISI, October 2025.
- Anthropic. "GTG-1002: AI-Orchestrated Cyber Campaign." November 2025.
- Google DeepMind. "CaMeL: Defeating Prompt Injection by Design." March 2025.
- Meta. "LlamaFirewall: Open-Source AI Safety Framework." April 2025.
- MITRE. "ATLAS v4.6.0." October 2025.
- OWASP. "LLM Top 10 2025." January 2025.
- OWASP. "Agentic AI Top 10." December 2025.
- NIST. "Cyber AI Profile (IR 8596) Preliminary Draft." December 2025.
- European Parliament. "EU AI Act (Regulation 2024/1689)." 2024.
- Qi et al. "Safety Alignment Depth." Princeton, May 2025.
- Weng et al. "H-CoT: Hijacking Chain-of-Thought." Duke/Accenture, February 2025.
- Borghesi et al. "SACRED-Bench: Compositional Audio Attacks." November 2025.