snailsploit[$]
frameworks / aatmf
aatmf · v3
cc by-sa 4.0
mitre atlas v4.6.0
updated 2026-05-01
framework · v3

adversarial AI
threat modeling.

AATMF applies adversarial psychology to machine systems. It does for AI what MITRE ATT&CK does for enterprise networks — a common language, complete taxonomy, and actionable procedures for AI red teaming, threat modeling, and defense.

Traditional cybersecurity frameworks miss the attack surfaces unique to AI: prompt injection, training data poisoning, model extraction, agentic exploitation, RAG manipulation, and the human feedback loops that shape model behavior. AATMF fills that gap with a structured approach to LLM security testing.

at a glance
tactics · 15
techniques · 240
procedures · 2,152+
prompts · 4,980+
"AI systems are vulnerable to social engineering because they were trained to respond like humans. This is the first technology where human manipulation techniques directly translate to technical exploitation."
— core thesis · aatmf v3
00 · quick start
Pick the path that matches your task. Every link drops into operational material — checklists, playbooks, signature libraries.

quick start: ai red team paths.

01 · why v3
The threat surface shifted in 2025–2026. Every tactic updated. New operational volumes. Namespaced IDs eliminate prior collisions.

the threat surface shifted.

year | what happened | what it means
2025 · 01 | Policy Puppetry bypasses every frontier model | Jailbreaking is now a commodity
2025 · 02 | Reasoning models autonomously jailbreak others (97% ASR) | AI-vs-AI attacks are real
2025 · 03 | GTG-1002, the first state-sponsored AI-orchestrated cyberattack | Agentic AI is weaponized
2025 · 04 | MCP tool poisoning (84% ASR on production agents) | Tool ecosystems are attack surfaces
2026 · 05 | 250 poisoned documents backdoor any model, at any size | Training poisoning is trivially cheap
2026 · 06 | PoisonedRAG (90% ASR with 5 injected texts) | RAG security is fundamentally broken
2026 · 07 | Deepfake fraud tripled to $1.1 billion in 2025 | Real-world harm at scale
02 · the 15 tactics
Three groupings: core model-level attacks, advanced attack surface (multimodal, agentic, RAG), and the systems-and-people layer around the model.

15 ai attack tactics.

core · t1–t8 · direct model-level attacks
T1
Prompt & Context Subversion

Manipulate model instructions and context. System prompt extraction, instruction hierarchy override, context window flooding, and delimiter exploitation. The foundational tactic — most attacks start here.

16 techniques · 76 procedures
T2
Semantic & Linguistic Evasion

Bypass filters through language manipulation. Encoding tricks, character substitution, multilingual pivots, homoglyph attacks, and obfuscation chains. The arms race between input filters and the creativity of natural language.

20 techniques · 161 procedures
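As a defensive illustration of the surface T2 describes, here is a minimal input normalizer. This is a sketch only: the function name and the zero-width list are assumptions, not part of AATMF. Unicode NFKC folding maps many fullwidth and compatibility homoglyphs back to their ASCII equivalents, and stripping zero-width characters defeats the trick of splitting trigger words invisibly.

```python
import unicodedata

# Zero-width characters commonly used to split words past keyword filters
# (illustrative subset, not an exhaustive list).
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}

def normalize_input(text: str) -> str:
    """Fold compatibility homoglyphs (NFKC) and strip zero-width characters."""
    folded = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in folded if ch not in ZERO_WIDTH)
```

A filter that matches only the raw input misses both tricks; normalizing first closes the two cheapest evasions in this tactic.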
T3
Reasoning & Constraint Exploitation

Exploit logical reasoning and constraints. Hypothetical framing, roleplay escalation, ethical dilemma construction, chain-of-thought manipulation, and recursive reasoning loops. Turns the model's own reasoning capabilities against its safety training.

19 techniques · 178 procedures
T4
Multi-Turn & Memory Manipulation

Leverage conversation history and memory. Context poisoning across turns, memory injection, conversation state manipulation, and persistent backdoor establishment. Particularly dangerous in agents with long-term memory.

16 techniques · 147 procedures
T5
Model & API Exploitation

Attack model interfaces and APIs. Parameter manipulation, token budget exhaustion, embedding space attacks, logit bias exploitation, and model fingerprinting. The technical attack surface beneath the natural language interface.

16 techniques · 142 procedures
T6
Training & Feedback Poisoning

Corrupt training data and feedback loops. RLHF manipulation, preference poisoning, data injection during fine-tuning, and reward hacking. 250 poisoned documents can backdoor any model regardless of size.

15 techniques · 141 procedures
T7
Output Manipulation & Exfiltration

Manipulate outputs and extract data. Steganographic encoding in responses, structured data leakage, gradual extraction through benign-looking queries, and output format exploitation.

15 techniques · 146 procedures
T8
External Deception & Misinformation

Generate deceptive content at scale. Deepfake text generation, authority impersonation, citation fabrication, and automated disinformation pipelines. Deepfake fraud tripled to $1.1 billion in 2025.

15 techniques · 150 procedures
advanced · t9–t12 · tools, memory, autonomy
T9
Multimodal & Cross-Channel Attacks

Attack across modalities. Image-embedded prompts, audio adversarial examples, cross-modal injection, and OCR exploitation. The attack surface expands every time a model gains a new input type.

17 techniques · 147 procedures
T10
Integrity & Confidentiality Breach

Extract data and breach integrity. Training data extraction, membership inference, model inversion, and PII recovery from fine-tuned models. What the model learned, an attacker can sometimes recover.

15 techniques · 147 procedures
T11
Agentic & Orchestrator Exploitation

Attack autonomous agents and orchestrators. MCP tool poisoning (84% ASR on production agents), agent-to-agent manipulation, orchestrator confusion, and autonomous goal hijacking. The fastest-growing attack surface.

16 techniques · 160 procedures
T12
RAG & Knowledge Base Manipulation

Poison retrieval systems. Document injection, embedding collision, knowledge base backdoors, and retrieval ranking manipulation. PoisonedRAG hits 90% ASR with 5 injected texts.

15 techniques · 149 procedures
infra & human · t13–t15 · systems and people around the model
T13
AI Supply Chain & Artifact Trust

Compromise the model supply chain. Model repository poisoning, adapter backdoors, quantization attacks, and dependency confusion in ML pipelines.

15 techniques · 150 procedures
T14
Infrastructure & Economic Warfare

Attack AI infrastructure. Compute denial, API abuse for economic damage, model serving disruption, and resource exhaustion attacks.

15 techniques · 150 procedures
T15
Human Workflow Exploitation

Manipulate human reviewers and workflows. RLHF annotator manipulation, red team exhaustion, compliance theater exploitation, and safety review bypass through procedural gaming.

15 techniques · 108 procedures
total · 15 tactics · 240 techniques · 2,152+ procedures
03 · risk scoring
AATMF-R v3 scores every technique on six factors. The result is a comparable risk number across tactics, organizations, and time.

ai risk scoring · aatmf-r v3.

formula
Risk  =  (L × I × E) / 6  ×  (D / 6)  ×  R  ×  C
six factors
L · 1–5 · Likelihood · probability of successful exploitation
I · 1–5 · Impact · severity of a successful attack
E · 1–5 · Exploitability · ease of execution (skill, resources, access)
D · 1–5 · Detectability · difficulty of detection; 5 means nearly invisible
R · 1–5 · Recoverability · effort to recover; 5 means irrecoverable
C · 0.5–2.0 · Cost factor · economic impact multiplier
rating scale
250+ · Critical
200–249 · High
150–199 · Medium
100–149 · Low
0–99 · Informational
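The formula and bands above can be sketched in code. The function names here are illustrative; the arithmetic and thresholds are taken verbatim from this section.

```python
def aatmf_r_score(l: int, i: int, e: int, d: int, r: int, c: float) -> float:
    """AATMF-R v3: Risk = (L × I × E) / 6 × (D / 6) × R × C."""
    return (l * i * e) / 6 * (d / 6) * r * c

def rating(score: float) -> str:
    """Map a score onto the AATMF-R v3 rating scale."""
    if score >= 250:
        return "Critical"
    if score >= 200:
        return "High"
    if score >= 150:
        return "Medium"
    if score >= 100:
        return "Low"
    return "Informational"

# A likely, hard-to-detect attack with moderate recovery cost and
# above-average economic impact (factor values are hypothetical):
score = aatmf_r_score(l=4, i=5, e=4, d=5, r=3, c=1.5)
```

Because every factor multiplies, a single low factor pulls the whole score down; that is what makes the number comparable across tactics.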
04 · architecture
Hierarchy and ID system. v3 namespacing eliminates the 43 ID collisions present in earlier versions.

framework architecture.

AATMF v3
├── 15 Tactics
│ ├── 240 Techniques
│ │ ├── 2,152+ Attack Procedures
│ │ │ └── 4,980+ Prompts
│ │ ├── Detection Patterns
│ │ └── Mitigation Controls
│ └── Risk Scoring (AATMF-R v3)
└── Supporting Infrastructure
 ├── Detection Signatures · YARA · Sigma · MCP
 ├── Response Playbooks
 ├── Assessment Templates
 └── Compliance Mappings · ATLAS · NIST · EU AI Act
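The hierarchy above maps naturally onto a nested data model. A minimal sketch follows; the class and field names are assumptions, not an official schema, and the sample technique name is illustrative.

```python
from dataclasses import dataclass, field

@dataclass
class Procedure:
    id: str                                  # e.g. "T1-AP-001A"
    prompts: list = field(default_factory=list)

@dataclass
class Technique:
    id: str                                  # e.g. "T1-AT-001"
    name: str
    procedures: list = field(default_factory=list)

@dataclass
class Tactic:
    id: str                                  # "T1" .. "T15"
    name: str
    techniques: list = field(default_factory=list)

t1 = Tactic("T1", "Prompt & Context Subversion", [
    Technique("T1-AT-001", "System Prompt Extraction",
              [Procedure("T1-AP-001A")]),
])
```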
namespaced id system

Every technique and procedure now declares its parent tactic in the identifier. Tactic membership is visible at a glance; cross-version migrations are unambiguous.

T{n}-AT-{seq:03d} · Technique ID · T1-AT-001 · T11-AT-016
T{n}-AP-{seq}{L} · Attack Procedure · T1-AP-001A · T3-AP-010B
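A parser for the two ID patterns can be sketched as follows. The function name is an assumption; the rules enforced come from the patterns above: tactic numbers 1–15, three-digit sequence numbers, and a variant letter on procedures only.

```python
import re

# T{n}-AT-{seq:03d} for techniques, T{n}-AP-{seq}{L} for procedures.
ID_RE = re.compile(r"^T(1[0-5]|[1-9])-(AT|AP)-(\d{3})([A-Z]?)$")

def parse_aatmf_id(s: str):
    """Parse a namespaced AATMF v3 ID into its parts, or return None."""
    m = ID_RE.match(s)
    if not m:
        return None
    tactic, kind, seq, variant = m.groups()
    if kind == "AT" and variant:   # only procedures carry a variant letter
        return None
    return {"tactic": int(tactic), "kind": kind,
            "seq": int(seq), "variant": variant or None}
```

Because the parent tactic is embedded in the identifier, a single string split recovers tactic membership with no lookup table.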
05 · cross-framework
AATMF maps directly to the standards enterprises already use. Additional mappings: Agentic Top 10 (Dec 2025), NIST AI RMF / IR 8596, EU AI Act risk categories, CWE / CVE.

cross-framework mapping: mitre atlas, nist.

aatmf tactic | mitre atlas | owasp llm top 10
T1 Prompt & Context Subversion | AML.T0051 LLM Prompt Injection | LLM01 · LLM02 · LLM03 · LLM04 · LLM06 · LLM07 · LLM08 · LLM10
T2 Semantic & Linguistic Evasion | AML.T0054 LLM Jailbreak | LLM01
T3 Reasoning & Constraint Exploitation | AML.T0054.001–003 | LLM01
T4 Multi-Turn & Memory Manipulation | AML.T0056 LLM Meta Prompt Extraction | LLM07
T5 Model & API Exploitation | AML.T0044 Full ML Model Access |
T6 Training & Feedback Poisoning | AML.T0020 Poison Training Data | LLM04
T7 Output Manipulation & Exfiltration | AML.T0024.002 Exfil via Inference API | LLM02 · LLM05
T8 External Deception & Misinformation | AML.T0048 Societal Harm | LLM05 · LLM09
T9 Multimodal & Cross-Channel Attacks | AML.T0051 (cross-modal variants) | LLM01
T10 Integrity & Confidentiality Breach | AML.T0024 Exfil via Cyber Means | LLM02
T11 Agentic & Orchestrator Exploitation | AML.T0057 LLM Agent Abuse | LLM06
T12 RAG & Knowledge Base Manipulation | AML.T0058 RAG Poisoning | LLM04 · LLM08
T13 AI Supply Chain & Artifact Trust | AML.T0010 ML Supply Chain Compromise | LLM03
T14 Infrastructure & Economic Warfare | AML.T0029 Denial of ML Service | LLM10
T15 Human Workflow Exploitation | AML.T0048.004 Reputational Harm |
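Consumed programmatically, the tactic column of the table becomes a simple lookup. The dict and function names below are mine, and only a subset of rows is shown; the mappings themselves come from the table above.

```python
# AATMF tactic -> primary MITRE ATLAS technique (subset of the table above).
ATLAS_MAP = {
    "T1":  "AML.T0051",   # LLM Prompt Injection
    "T6":  "AML.T0020",   # Poison Training Data
    "T11": "AML.T0057",   # LLM Agent Abuse
    "T12": "AML.T0058",   # RAG Poisoning
    "T13": "AML.T0010",   # ML Supply Chain Compromise
}

def atlas_for(aatmf_id: str):
    """Map a namespaced AATMF ID (e.g. the hypothetical 'T6-AT-003')
    to its ATLAS technique via the tactic prefix."""
    tactic = aatmf_id.split("-", 1)[0]
    return ATLAS_MAP.get(tactic)
```

The namespaced IDs make this a one-line prefix split: no per-technique mapping table is needed when the mapping is defined at the tactic level.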
06 · documentation
Seven volumes. Read sequentially or jump to the operational material in V — the rest is reference.

aatmf documentation: 7 volumes.

I
Framework Foundations

Methodology, risk assessment (AATMF-R v3), and framework architecture. Start here to understand structure, scoring, and how tactics chain together.

II
Core Attack Tactics · T1–T8

Prompt subversion, semantic evasion, reasoning exploitation, memory manipulation, API attacks, training poisoning, output exfiltration, and deception.

III
Advanced Attack Tactics · T9–T12

Multimodal attacks, integrity breaches, agentic exploitation, and RAG manipulation. The attack surface that emerged as models gained tools, memory, and autonomy.

IV
Infrastructure & Human · T13–T15

Supply chain compromise, infrastructure warfare, and human workflow exploitation. Tactics that target the systems and people around the model, not the model itself.

V
Implementation & Operations

Detection engineering, mitigation strategies, incident response playbooks, and red/blue team operations. How to operationalize AATMF.

VI
Governance & Compliance

Risk management framework, compliance mapping to OWASP, MITRE ATLAS, NIST, and the EU AI Act, and training programs.

VII
Appendices & Resources

Complete catalog of all 240 techniques, detection signatures (YARA / Sigma / MCP), assessment templates, case studies, and glossary.

For automated testing, see the AATMF Toolkit — a Python CLI that runs procedures against any LLM endpoint and emits AATMF-R-scored reports.
get the red-card starter pack

10 ready-to-run
red team scenarios.

Evaluation scenarios for testing AI systems against common attack vectors. YAML templates drop straight into CI/CD. Mapped to the OWASP LLM Top 10 and MITRE ATLAS so the output reads in your existing review process.

  • 10 ready-to-run red team scenarios
  • YAML templates for CI/CD pipelines
  • Risk scoring worksheets (AATMF-R v3)
  • Mapped to the OWASP LLM Top 10 and MITRE ATLAS
aatmf · red-card · starter v3
email
no spam. unsubscribe anytime. starter pack is CC BY-SA 4.0.
08 · cite
License is CC BY-SA 4.0. Use, modify, and share with attribution. The framework is open source — pull requests welcome.

citation & source.

@misc{aizen2026aatmf,
 title = {AATMF v3: Adversarial AI Threat Modeling Framework},
 author = {Aizen, Kai},
 year = {2026},
 url = {https://github.com/snailsploit/aatmf},
 note = {15 tactics, 240 techniques, 2,152+ procedures}
}
license
CC BY-SA 4.0
creator of aatmf · author of adversarial minds · nvd contributor
more frameworks all frameworks →
SEF → Social engineering framework
P.R.O.M.P.T → Compositional grammar
Claude-Red → Skills library
Toolkit → LLM safety CLI
Playbook → Diagnostic methodology