Part 1: Introduction and Methodology
The Critical Need for AI Threat Modeling
Artificial intelligence has transitioned from research curiosity to critical infrastructure. Language models process medical queries, legal documents, financial transactions, and government communications. Yet the security frameworks designed to protect these systems were built for a fundamentally different paradigm.
Traditional cybersecurity operates on deterministic logic: inputs produce predictable outputs, vulnerabilities have defined boundaries, and exploits follow reproducible steps. AI systems break every one of these assumptions. They are probabilistic, context-dependent, and — critically — trained on human language, making them susceptible to the same manipulation techniques that have been used against humans for millennia.
This is the core thesis of AATMF: AI systems are vulnerable to social engineering because they were trained to respond like humans. This is the first technology where human manipulation techniques directly translate to technical exploitation.
Genesis and Evolution
| Version | Date | Scope |
|---|---|---|
| v1.0 | 2024 | Initial framework, 8 tactics |
| v2.0 | Late 2024 | Expanded to 12 tactics, added risk scoring |
| v3 | February 2026 | 15 tactics, 240 techniques, 2,152+ procedures, namespaced IDs, Volumes V–VII, 2025–2026 threat integration |
Scope
AATMF covers adversarial threats against:
- Large Language Models (LLMs) and Large Reasoning Models (LRMs)
- Multimodal models (vision, audio, video)
- Retrieval-Augmented Generation (RAG) systems
- Autonomous AI agents and multi-agent orchestrators
- AI development and deployment infrastructure
- Human-in-the-loop workflows
- AI supply chains (models, datasets, tools, libraries)
Threat Actor Taxonomy
| Actor | Motivation | Typical Tactics | Sophistication |
|---|---|---|---|
| Script kiddies | Curiosity, clout | T1, T2 | Low |
| Bug bounty hunters | Financial reward | T1–T5, T10 | Medium–High |
| Cybercriminals | Financial gain | T1–T3, T7–T8, T13 | Medium |
| Corporate espionage | Competitive advantage | T5, T10, T13–T14 | High |
| Nation-state actors | Strategic advantage | T6, T11, T13–T15 | Very High |
| AI red teams | Security improvement | All | Very High |
| Insiders | Various | T6, T15 | Variable |
Methodology
Each technique in AATMF is documented with:
- Unique namespaced identifier: T{tactic}-AT-{sequence:03d}
- Risk score: computed via the AATMF-R v3 six-factor formula
- Attack procedures: concrete implementation variants with example prompts
- Detection patterns: signatures and heuristics for identifying the technique
- Mitigation controls: defensive measures mapped to the technique
- Cross-framework references: mappings to MITRE ATLAS, OWASP, NIST, EU AI Act
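The six documented elements can be pictured as one record per technique. The following is a minimal sketch only: the `Technique` class and its field names are hypothetical and are not part of AATMF, and the ATLAS reference shown is the tactic-level mapping for T1 from the cross-framework table, used here purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Technique:
    """Hypothetical record mirroring the six elements listed above."""

    technique_id: str                      # e.g. "T1-AT-001"
    name: str
    risk_score: float                      # AATMF-R v3 score
    procedures: list[str] = field(default_factory=list)
    detection_patterns: list[str] = field(default_factory=list)
    mitigations: list[str] = field(default_factory=list)
    cross_references: dict[str, str] = field(default_factory=dict)


# Values taken from the worked example in Part 2; the lists are left empty
# rather than filled with invented procedures or controls.
example = Technique(
    technique_id="T1-AT-001",
    name="Instruction Override Injection",
    risk_score=25.0,
    cross_references={"MITRE ATLAS": "AML.T0051"},
)
print(example.technique_id, example.risk_score)
```

Keeping every element on a single record makes it straightforward to check that no technique is missing its detection or mitigation entries.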
Part 2: Risk Assessment Methodology (AATMF-R v3)
Formula
Risk = (L × I × E) / 6 × (D / 6) × R × C
Factors
| Factor | Symbol | Range | Description |
|---|---|---|---|
| Likelihood | L | 1–5 | Probability of successful exploitation |
| Impact | I | 1–5 | Severity of successful attack |
| Exploitability | E | 1–5 | Ease of execution (skill, resources, access required) |
| Detectability | D | 1–5 | Difficulty of detection (5 = nearly invisible) |
| Recoverability | R | 1–5 | Effort to recover (5 = irrecoverable) |
| Cost Factor | C | 0.5–2.0 | Economic impact multiplier |
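The formula and factor ranges above can be sketched as a small function. This is an illustrative sketch, not an official implementation: the name `aatmf_risk` and the range checks are assumptions, and the expression is evaluated left to right, which matches the worked example later in this part.

```python
def aatmf_risk(likelihood: float, impact: float, exploitability: float,
               detectability: float, recoverability: float,
               cost_factor: float) -> float:
    """Risk = (L x I x E) / 6 x (D / 6) x R x C, evaluated left to right."""
    five_point = {
        "likelihood": likelihood,
        "impact": impact,
        "exploitability": exploitability,
        "detectability": detectability,
        "recoverability": recoverability,
    }
    # Range checks are an assumption based on the Factors table.
    for name, value in five_point.items():
        if not 1 <= value <= 5:
            raise ValueError(f"{name} must be in 1..5, got {value}")
    if not 0.5 <= cost_factor <= 2.0:
        raise ValueError(f"cost_factor must be in 0.5..2.0, got {cost_factor}")

    return ((likelihood * impact * exploitability) / 6
            * (detectability / 6) * recoverability * cost_factor)


# Factor values from the T1-AT-001 example calculation.
print(round(aatmf_risk(5, 4, 5, 3, 2, 1.5), 2))
```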
Scoring Guidelines
Likelihood (L)
| Score | Label | Criteria |
|---|---|---|
| 1 | Rare | Requires novel research, no known PoC |
| 2 | Unlikely | Requires specialized knowledge |
| 3 | Possible | Known technique, moderate skill required |
| 4 | Likely | Well-documented, readily available tools |
| 5 | Almost Certain | Automated, commodity attack |
Impact (I)
| Score | Label | Criteria |
|---|---|---|
| 1 | Negligible | Minor policy violation, no data exposure |
| 2 | Minor | Limited harmful content, no sensitive data |
| 3 | Moderate | Sensitive data exposure, service degradation |
| 4 | Major | Critical data breach, safety bypass, service outage |
| 5 | Catastrophic | Physical harm potential, mass data breach, systemic compromise |
Exploitability (E)
| Score | Label | Criteria |
|---|---|---|
| 1 | Theoretical | Requires custom research and novel techniques |
| 2 | Difficult | Needs deep expertise and specific conditions |
| 3 | Moderate | Documented approach, some skill required |
| 4 | Easy | Copy-paste attacks, minimal customization |
| 5 | Trivial | Automated tools, zero skill required |
Detectability (D)
| Score | Label | Criteria |
|---|---|---|
| 1 | Obvious | Trivially detected by basic filters |
| 2 | Easy | Standard monitoring catches it |
| 3 | Moderate | Requires specialized detection |
| 4 | Difficult | Advanced analysis needed |
| 5 | Nearly Invisible | No reliable detection method exists |
Recoverability (R)
| Score | Label | Criteria |
|---|---|---|
| 1 | Immediate | Auto-recoverable, no intervention needed |
| 2 | Quick | Simple rollback or reset |
| 3 | Moderate | Requires investigation and manual remediation |
| 4 | Difficult | Extended downtime, data loss possible |
| 5 | Irrecoverable | Permanent damage, no full recovery path |
Cost Factor (C)
| Range | Criteria |
|---|---|
| 0.5 | Minimal economic impact, internal only |
| 1.0 | Standard business impact |
| 1.5 | Significant financial or reputational damage |
| 2.0 | Catastrophic economic consequences |
Risk Rating Scale
| Score | Rating | Color | Action Required |
|---|---|---|---|
| 250+ | CRITICAL | 🔴 | Immediate remediation required |
| 200–249 | HIGH | 🟠 | Remediation within current sprint |
| 150–199 | MEDIUM | 🟡 | Scheduled remediation |
| 100–149 | LOW | 🔵 | Risk accepted or monitored |
| 0–99 | INFO | ⚪ | Documented, no action required |
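The rating scale can be read as a threshold lookup. The sketch below assumes each band starts at its lower bound, so a fractional score such as 199.5 falls in MEDIUM; the names `RATING_BANDS` and `rating_for` are hypothetical.

```python
# (lower bound, rating) pairs from the Risk Rating Scale, highest first.
RATING_BANDS = [
    (250, "CRITICAL"),
    (200, "HIGH"),
    (150, "MEDIUM"),
    (100, "LOW"),
    (0, "INFO"),
]


def rating_for(score: float) -> str:
    """Return the rating whose lower bound is the highest one not above score."""
    if score < 0:
        raise ValueError(f"score must be non-negative, got {score}")
    for lower_bound, rating in RATING_BANDS:
        if score >= lower_bound:
            return rating
    raise AssertionError("unreachable: the 0 band catches every valid score")


print(rating_for(25.0))
print(rating_for(173.6))
```

With the factor ranges given above, the formula as written tops out at about 173.6 (L, I, E, D, and R at 5 with C at 2.0), so it is worth checking how scores are scaled before relying on the HIGH and CRITICAL bands.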
Example Calculation
T1-AT-001 — Instruction Override Injection
| Factor | Score | Rationale |
|---|---|---|
| Likelihood | 5 | Commodity attack, automated tools exist |
| Impact | 4 | Complete safety bypass |
| Exploitability | 5 | Copy-paste, zero skill |
| Detectability | 3 | Pattern-matchable but evolving |
| Recoverability | 2 | Session-scoped, no persistent damage |
| Cost Factor | 1.5 | Brand and regulatory risk |
Risk = (5 × 4 × 5) / 6 × (3 / 6) × 2 × 1.5
= 100/6 × 0.5 × 2 × 1.5
= 16.67 × 0.5 × 2 × 1.5
= 25.0
Note: Scores vary with deployment context. The same technique would score very differently on Impact and Cost Factor when it targets a chatbot than when it targets an autonomous financial agent.
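To illustrate the note, the sketch below holds the example's Likelihood, Exploitability, Detectability, and Recoverability fixed and changes only Impact and Cost Factor. The chatbot values are those of the worked example; the financial-agent values (Impact 5, Cost Factor 2.0) are hypothetical and chosen only to show the effect.

```python
def aatmf_risk(L, I, E, D, R, C):
    """AATMF-R v3 formula as written in this part, evaluated left to right."""
    return (L * I * E) / 6 * (D / 6) * R * C


# Factors held constant, taken from the T1-AT-001 example.
shared = dict(L=5, E=5, D=3, R=2)

chatbot = aatmf_risk(I=4, C=1.5, **shared)          # worked-example values
financial_agent = aatmf_risk(I=5, C=2.0, **shared)  # hypothetical values

print(f"chatbot:         {chatbot:.2f}")
print(f"financial agent: {financial_agent:.2f}")
```

Under these assumed values the score rises from 25.00 to about 41.67, driven entirely by the two context-dependent factors.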
Part 3: Framework Architecture
Hierarchical Structure
AATMF v3
├── 15 Tactics (high-level adversarial objectives)
│ ├── 240 Techniques (specific attack methods)
│ │ ├── 2,152+ Attack Procedures (implementation variants)
│ │ │ └── 4,980+ Prompts (actual attack examples)
│ │ ├── Detection Patterns
│ │ └── Mitigation Controls
│ └── Risk Scoring (AATMF-R v3)
└── Cross-Framework Mappings
├── MITRE ATLAS v4.6.0
├── OWASP LLM Top 10 2025
├── NIST AI RMF / IR 8596
└── EU AI Act
Namespaced Identifier System
v3 introduces namespaced identifiers to eliminate AT-ID collisions:
| Element | Format | Example |
|---|---|---|
| Tactic | T{n} | T1, T15 |
| Technique | T{n}-AT-{seq:03d} | T1-AT-001, T11-AT-016 |
| Attack Procedure | T{n}-AP-{seq}{letter} | T1-AP-001A, T3-AP-010B |
Why Namespacing?
In v3.0, AT-010 referred to "Dialogue Hijacking" in T1 and "Euphemism Exploitation" in T2 — completely different techniques sharing the same ID. Across all 15 tactics, 43 such collisions existed. The namespaced system guarantees every identifier is globally unique while preserving tactic membership at a glance.
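A short sketch of how the namespaced formats could be generated and parsed. The function names and the regular expression are hypothetical. Based on the examples in the table, the sketch assumes procedure sequences are zero-padded to three digits, the variant is a single uppercase letter, and tactic numbers run from 1 to 15.

```python
import re

# Assumed grammar: T{n}-AT-{seq:03d} or T{n}-AP-{seq:03d}{letter}
_ID_PATTERN = re.compile(
    r"^T(?P<tactic>\d{1,2})-(?P<kind>AT|AP)-(?P<seq>\d{3})(?P<variant>[A-Z]?)$"
)


def technique_id(tactic: int, seq: int) -> str:
    return f"T{tactic}-AT-{seq:03d}"


def procedure_id(tactic: int, seq: int, variant: str) -> str:
    return f"T{tactic}-AP-{seq:03d}{variant}"


def parse_id(identifier: str) -> dict:
    """Split a namespaced ID into its parts, rejecting malformed input."""
    match = _ID_PATTERN.match(identifier)
    if match is None:
        raise ValueError(f"not a namespaced AATMF ID: {identifier!r}")
    tactic = int(match["tactic"])
    if not 1 <= tactic <= 15:
        raise ValueError(f"tactic out of range in {identifier!r}")
    kind, variant = match["kind"], match["variant"]
    # Procedures carry a variant letter; techniques must not.
    if (kind == "AP") != bool(variant):
        raise ValueError(f"variant letter mismatch in {identifier!r}")
    return {"tactic": tactic, "kind": kind,
            "seq": int(match["seq"]), "variant": variant or None}


# The two techniques that previously shared AT-010 now get distinct IDs.
print(technique_id(1, 10), technique_id(2, 10))
print(parse_id("T3-AP-010B"))
```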
Cross-Framework Mappings
MITRE ATLAS v4.6.0 (October 2025)
| AATMF Tactic | Primary ATLAS Mapping |
|---|---|
| T1: Prompt Subversion | AML.T0051 LLM Prompt Injection |
| T2: Semantic Evasion | AML.T0054 LLM Jailbreak |
| T3: Reasoning Exploitation | AML.T0054.001–003 |
| T4: Multi-Turn | AML.T0056 LLM Meta Prompt Extraction |
| T5: Model/API Exploitation | AML.T0044 Full ML Model Access |
| T6: Training Poisoning | AML.T0020 Poison Training Data |
| T7: Output Manipulation | AML.T0024.002 Exfiltration via ML Inference API |
| T8: Deception | AML.T0048 Societal Harm |
| T9: Multimodal | AML.T0051 (cross-modal variants) |
| T10: Integrity Breach | AML.T0024 Exfiltration via Cyber Means |
| T11: Agentic | AML.T0057 LLM Agent Abuse |
| T12: RAG Manipulation | AML.T0058 RAG Poisoning |
| T13: Supply Chain | AML.T0010 ML Supply Chain Compromise |
| T14: Infrastructure | AML.T0029 Denial of ML Service |
| T15: Human Workflow | AML.T0048.004 Reputational Harm |
OWASP LLM Top 10 2025
| OWASP Entry | AATMF Coverage |
|---|---|
| LLM01: Prompt Injection | T1, T2, T3, T9 |
| LLM02: Sensitive Information Disclosure | T7, T10 |
| LLM03: Supply Chain Vulnerabilities | T13 |
| LLM04: Data and Model Poisoning | T6, T12 |
| LLM05: Improper Output Handling | T7, T8 |
| LLM06: Excessive Agency | T11 |
| LLM07: System Prompt Leakage | T1, T4 |
| LLM08: Vector and Embedding Weaknesses | T12 |
| LLM09: Misinformation | T8 |
| LLM10: Unbounded Consumption | T14 |
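The OWASP table is keyed by OWASP entry, but a reviewer often starts from an AATMF tactic. The sketch below inverts the table to answer that question; the dictionary is transcribed from the table above, while the variable and function names are hypothetical.

```python
from collections import defaultdict

# Transcribed from the OWASP LLM Top 10 2025 coverage table.
OWASP_TO_AATMF = {
    "LLM01": ["T1", "T2", "T3", "T9"],
    "LLM02": ["T7", "T10"],
    "LLM03": ["T13"],
    "LLM04": ["T6", "T12"],
    "LLM05": ["T7", "T8"],
    "LLM06": ["T11"],
    "LLM07": ["T1", "T4"],
    "LLM08": ["T12"],
    "LLM09": ["T8"],
    "LLM10": ["T14"],
}


def invert(mapping: dict[str, list[str]]) -> dict[str, list[str]]:
    """Turn {owasp_entry: [tactics]} into {tactic: [owasp_entries]}."""
    inverted = defaultdict(list)
    for entry, tactics in mapping.items():
        for tactic in tactics:
            inverted[tactic].append(entry)
    return dict(inverted)


AATMF_TO_OWASP = invert(OWASP_TO_AATMF)

# Tactics that do not appear anywhere in this particular table.
unmapped = [f"T{n}" for n in range(1, 16) if f"T{n}" not in AATMF_TO_OWASP]

print(AATMF_TO_OWASP["T1"])
print(unmapped)
```

In this table, T5 and T15 have no OWASP entry listed; whether they are covered elsewhere in the framework is not stated in this section.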
Tactic Overview
| ID | Tactic | Techniques | Procedures |
|---|---|---|---|
| T1 | Prompt & Context Subversion | 16 | 76 |
| T2 | Semantic & Linguistic Evasion | 20 | 161 |
| T3 | Reasoning & Constraint Exploitation | 19 | 178 |
| T4 | Multi-Turn & Memory Manipulation | 16 | 147 |
| T5 | Model & API Exploitation | 16 | 142 |
| T6 | Training & Feedback Poisoning | 15 | 141 |
| T7 | Output Manipulation & Exfiltration | 15 | 146 |
| T8 | External Deception & Misinformation | 15 | 150 |
| T9 | Multimodal & Cross-Channel Attacks | 17 | 147 |
| T10 | Integrity & Confidentiality Breach | 15 | 147 |
| T11 | Agentic & Orchestrator Exploitation | 16 | 160 |
| T12 | RAG & Knowledge Base Manipulation | 15 | 149 |
| T13 | AI Supply Chain & Artifact Trust | 15 | 150 |
| T14 | Infrastructure & Economic Warfare | 15 | 150 |
| T15 | Human Workflow Exploitation | 15 | 108 |
| | Total | 240 | 2,152+ |
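The per-tactic counts can be checked against the stated totals with a few lines. The numbers below are transcribed from the Tactic Overview table; the variable names are hypothetical.

```python
# (techniques, procedures) per tactic, transcribed from the table above.
TACTIC_COUNTS = {
    "T1": (16, 76),   "T2": (20, 161),  "T3": (19, 178),
    "T4": (16, 147),  "T5": (16, 142),  "T6": (15, 141),
    "T7": (15, 146),  "T8": (15, 150),  "T9": (17, 147),
    "T10": (15, 147), "T11": (16, 160), "T12": (15, 149),
    "T13": (15, 150), "T14": (15, 150), "T15": (15, 108),
}

total_techniques = sum(techniques for techniques, _ in TACTIC_COUNTS.values())
total_procedures = sum(procedures for _, procedures in TACTIC_COUNTS.values())

print(f"{len(TACTIC_COUNTS)} tactics, "
      f"{total_techniques} techniques, {total_procedures} procedures")
```

The listed rows sum to 240 techniques and 2,152 procedures, consistent with the Total row.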