AATMF (Adversarial AI Threat Modeling Framework) is an open framework for systematically threat-modeling AI systems. Version 3.1 contains 15 tactics, 240+ techniques, and 2,150+ procedures, mapped to NIST AI RMF and MITRE ATLAS.

How does AATMF compare to MITRE ATLAS?

AATMF maps to MITRE ATLAS but extends it: it includes per-procedure-level detail, an integrated risk-scoring system (AATMF-R), and explicit coverage of agentic, multi-turn, and supply-chain attack vectors that ATLAS treats more abstractly.

Is AATMF free to use?

Yes. AATMF is published openly under a permissive license. Citation is appreciated but not required for internal use.

snailsploit[$]

frameworks / aatmf

aatmf · v3

cc by-sa 4.0
mitre atlas v4.6.0
updated 2026-05-01

framework · v3

adversarial AI
threat modeling.

AATMF applies adversarial psychology to machine systems. It does for AI what MITRE ATT&CK does for enterprise networks — a common language, complete taxonomy, and actionable procedures for AI red teaming, threat modeling, and defense.

Traditional cybersecurity frameworks miss the attack surfaces unique to AI: prompt injection, training data poisoning, model extraction, agentic exploitation, RAG manipulation, and the human feedback loops that shape model behavior. AATMF fills that gap with a structured approach to LLM security testing.

at a glance

tactics15

techniques240

procedures2,152+

prompts4,980+

view on github →open prompt bank →quick start →

"AI systems are vulnerable to social engineering because they were trained to respond like humans. This is the first technology where human manipulation techniques directly translate to technical exploitation."
— core thesis · aatmf v3

00 · quick start

Pick the path that matches your task. Every link drops into operational material — checklists, playbooks, signature libraries.

quick start: ai red team paths.

01Understand the frameworkIntroduction & Architecture→02Run an AI red team assessmentRed Team Operations & Checklists→03Defend my AI systemBlue Team Defense & Mitigations→04Respond to an AI incidentIncident Response Playbooks→05Map to compliance· MITRE · NIST · EU AI Act→06Browse attack techniquesComplete Attack Catalog (240 techniques)→07Deploy detection signaturesYARA & Sigma Signatures Library→

01 · why v3

The threat surface shifted in 2025–2026. Every tactic updated. New operational volumes. Namespaced IDs eliminate prior collisions.

the threat surface shifted.

yearwhat happenedwhat it means

2025 · 01Policy Puppetry bypasses every frontier modelJailbreaking is now a commodity

2025 · 02Reasoning models autonomously jailbreak others — 97% ASRAI-vs-AI attacks are real

2025 · 03GTG-1002 — first state-sponsored AI-orchestrated cyberattackAgentic AI is weaponized

2025 · 04MCP tool poisoning — 84% ASR on production agentsTool ecosystems are attack surfaces

2026 · 05250 poisoned documents backdoor any model — any sizeTraining poisoning is trivially cheap

2026 · 06PoisonedRAG — 90% ASR with 5 injected textsRAG security is fundamentally broken

2026 · 07Deepfake fraud tripled to $1.1 billion in 2025Real-world harm at scale

02 · the 15 tactics

Three groupings: core model-level attacks, advanced attack surface (multimodal, agentic, RAG), and the systems-and-people layer around the model.

15 ai attack tactics.

core · t1–t8direct model-level attackstechproc

Prompt & Context Subversion

Manipulate model instructions and context. System prompt extraction, instruction hierarchy override, context window flooding, and delimiter exploitation. The foundational tactic — most attacks start here.

↗ prompt injection

1676

Semantic & Linguistic Evasion

Bypass filters through language manipulation. Encoding tricks, character substitution, multilingual pivots, homoglyph attacks, and obfuscation chains. The arms race between input filters and the creativity of natural language.

↗ LLM jailbreaking

20161

Reasoning & Constraint Exploitation

Exploit logical reasoning and constraints. Hypothetical framing, roleplay escalation, ethical dilemma construction, chain-of-thought manipulation, and recursive reasoning loops. Turns the model's own reasoning capabilities against its safety training.

19178

Multi-Turn & Memory Manipulation

Leverage conversation history and memory. Context poisoning across turns, memory injection, conversation state manipulation, and persistent backdoor establishment. Particularly dangerous in agents with long-term memory.

↗ memory manipulation

16147

Model & API Exploitation

Attack model interfaces and APIs. Parameter manipulation, token budget exhaustion, embedding space attacks, logit bias exploitation, and model fingerprinting. The technical attack surface beneath the natural language interface.

16142

Training & Feedback Poisoning

Corrupt training data and feedback loops. RLHF manipulation, preference poisoning, data injection during fine-tuning, and reward hacking. 250 poisoned documents can backdoor any model regardless of size.

15141

Output Manipulation & Exfiltration

Manipulate outputs and extract data. Steganographic encoding in responses, structured data leakage, gradual extraction through benign-looking queries, and output format exploitation.

15146

External Deception & Misinformation

Generate deceptive content at scale. Deepfake text generation, authority impersonation, citation fabrication, and automated disinformation pipelines. Deepfake fraud tripled to $1.1 billion in 2025.

15150

advanced · t9–t12tools, memory, autonomytechproc

Multimodal & Cross-Channel Attacks

Attack across modalities. Image-embedded prompts, audio adversarial examples, cross-modal injection, and OCR exploitation. The attack surface expands every time a model gains a new input type.

17147

T10

Integrity & Confidentiality Breach

Extract data and breach integrity. Training data extraction, membership inference, model inversion, and PII recovery from fine-tuned models. What the model learned, an attacker can sometimes unlearn.

15147

T11

Agentic & Orchestrator Exploitation

Attack autonomous agents and orchestrators. MCP tool poisoning (84% ASR on production agents), agent-to-agent manipulation, orchestrator confusion, and autonomous goal hijacking. The fastest-growing attack surface.

↗ MCP security analysis

16160

T12

RAG & Knowledge Base Manipulation

Poison retrieval systems. Document injection, embedding collision, knowledge base backdoors, and retrieval ranking manipulation. PoisonedRAG hits 90% ASR with 5 injected texts.

15149

infra & human · t13–t15systems and people around the modeltechproc

T13

AI Supply Chain & Artifact Trust

Compromise the model supply chain. Model repository poisoning, adapter backdoors, quantization attacks, and dependency confusion in ML pipelines.

15150

T14

Infrastructure & Economic Warfare

Attack AI infrastructure. Compute denial, API abuse for economic damage, model serving disruption, and resource exhaustion attacks.

15150

T15

Human Workflow Exploitation

Manipulate human reviewers and workflows. RLHF annotator manipulation, red team exhaustion, compliance theater exploitation, and safety review bypass through procedural gaming.

↗ human workflow exploitation (SEF)

15108

total15 tactics2402,152

03 · risk scoring

AATMF-R v3 scores every technique on six factors. The result is a comparable risk number across tactics, organizations, and time.

ai risk scoring · aatmf-r v3.

formula

Risk = (L × I × E) / 6 × (D / 6) × R × C

six factors

1–5

Likelihood

Probability of successful exploitation

1–5

Impact

Severity of successful attack

1–5

Exploitability

Ease of execution — skill, resources, access

1–5

Detectability

Difficulty of detection — 5 means nearly invisible

1–5

Recoverability

Effort to recover — 5 means irrecoverable

0.5–2.0

Cost factor

Economic impact multiplier

rating scale

250+Critical

200–249High

150–199Medium

100–149Low

0–99Informational

try the interactive calculator →

04 · architecture

Hierarchy and ID system. v3 namespacing eliminates the 43 ID collisions present in earlier versions.

framework architecture.

AATMF v3
├── 15 Tactics
│ ├── 240 Techniques
│ │ ├── 2,152+ Attack Procedures
│ │ │ └── 4,980+ Prompts
│ │ ├── Detection Patterns
│ │ └── Mitigation Controls
│ └── Risk Scoring (AATMF-R v3)
└── Supporting Infrastructure
 ├── Detection Signatures YARA · Sigma · MCP
 ├── Response Playbooks
 ├── Assessment Templates
 └── Compliance Mappings · ATLAS · NIST · EU AI Act

namespaced id system

Every technique and procedure now declares its parent tactic in the identifier. Tactic membership is visible at a glance; cross-version migrations are unambiguous.

T{n}-AT-{seq:03d}

Technique ID

T1-AT-001 · T11-AT-016

T{n}-AP-{seq}{L}

Attack Procedure

T1-AP-001A · T3-AP-010B

05 · cross-framework

AATMF maps directly to the standards enterprises already use. Additional mappings: Agentic Top 10 (Dec 2025), NIST AI RMF / IR 8596, EU AI Act risk categories, CWE / CVE.

cross-framework mapping: mitre atlas, nist.

aatmf tacticmitre atlas

T1 Prompt & Context SubversionAML.T0051 LLM Prompt InjectionLLM01 · LLM02 · LLM03 · LLM04 · LLM06 · LLM07 · LLM08 · LLM10

T2 Semantic & Linguistic EvasionAML.T0054 LLM JailbreakLLM01

T3 Reasoning & Constraint ExploitationAML.T0054.001–003LLM01

T4 Multi-Turn & Memory ManipulationAML.T0056 LLM Meta Prompt ExtractionLLM07

T5 Model & API ExploitationAML.T0044 Full ML Model Access—

T6 Training & Feedback PoisoningAML.T0020 Poison Training DataLLM04

T7 Output Manipulation & ExfiltrationAML.T0024.002 Exfil via Inference APILLM02 · LLM05

T8 External Deception & MisinformationAML.T0048 Societal HarmLLM05 · LLM09

T9 Multimodal & Cross-Channel AttacksAML.T0051 cross-modal variantsLLM01

T10 Integrity & Confidentiality BreachAML.T0024 Exfil via Cyber MeansLLM02

T11 Agentic & Orchestrator ExploitationAML.T0057 LLM Agent AbuseLLM06

T12 RAG & Knowledge Base ManipulationAML.T0058 RAG PoisoningLLM04 · LLM08

T13 AI Supply Chain & Artifact TrustAML.T0010 ML Supply Chain CompromiseLLM03

T14 Infrastructure & Economic WarfareAML.T0029 Denial of ML ServiceLLM10

T15 Human Workflow ExploitationAML.T0048.004 Reputational Harm—

full compliance mapping →

06 · documentation

Seven volumes. Read sequentially or jump to the operational material in V — the rest is reference.

aatmf documentation: 7 volumes.

Framework Foundations

Methodology, risk assessment (AATMF-R v3), and framework architecture. Start here to understand structure, scoring, and how tactics chain together.

→

Core Attack Tactics · T1–T8

Prompt subversion, semantic evasion, reasoning exploitation, memory manipulation, API attacks, training poisoning, output exfiltration, and deception.

→

III

Advanced Attack Tactics · T9–T12

Multimodal attacks, integrity breaches, agentic exploitation, and RAG manipulation. The attack surface that emerged as models gained tools, memory, and autonomy.

→

Infrastructure & Human · T13–T15

Supply chain compromise, infrastructure warfare, and human workflow exploitation. Tactics that target the systems and people around the model, not the model itself.

→

Implementation & Operations

Detection engineering, mitigation strategies, incident response playbooks, and red/blue team operations. How to operationalize AATMF.

→

Governance & Compliance

Risk management framework, compliance mapping to / MITRE / NIST / EU AI Act, and training programs.

→

VII

Appendices & Resources

Complete catalog of all 240 techniques, detection signatures (YARA / Sigma / MCP), assessment templates, case studies, and glossary.

→

For automated testing, see the AATMF Toolkit — a Python CLI that runs procedures against any LLM endpoint and emits AATMF-R-scored reports.

get the red-card starter pack

10 ready-to-run
red team scenarios.

Evaluation scenarios for testing AI systems against common attack vectors. YAML templates drop straight into CI/CD. Mapped to and MITRE ATLAS so the output reads in your existing review process.

+10 ready-to-run red team scenarios
+YAML templates for CI/CD pipelines
+Risk scoring worksheets (AATMF-R v3)
+Mapped to and MITRE ATLAS

aatmf · red-card · starter v3

no spam. unsubscribe anytime. starter pack is CC BY-SA 4.0.

08 · cite

License is CC BY-SA 4.0. Use, modify, and share with attribution. The framework is open source — pull requests welcome.

citation & source.

@misc{aizen2026aatmf,
 title = {AATMF v3: Adversarial AI Threat Modeling Framework},
 author = {Aizen, Kai},
 year = {2026},
 url = {https://github.com/snailsploit/aatmf},
 note = {15 tactics, 240 techniques, 2,152+ procedures}
}

license

CC BY-SA 4.0

source

github.com/SnailSploit/AATMF

creator of aatmf · author of adversarial minds · nvd contributor

more frameworks all frameworks →

SEF →Social engineering framework P.R.O.M.P.T →Compositional grammar Claude-Red →Skills library Toolkit →LLM safety CLI Playbook →Diagnostic methodology

adversarial AIthreat modeling.

quick start: ai red team paths.

the threat surface shifted.

15 ai attack tactics.

ai risk scoring · aatmf-r v3.

framework architecture.

cross-framework mapping: mitre atlas, nist.

aatmf documentation: 7 volumes.

10 ready-to-runred team scenarios.

related frameworks & research.

citation & source.

adversarial AI
threat modeling.

10 ready-to-run
red team scenarios.