AI Security Wiki
Reference for adversarial artificial intelligence, LLM security vulnerabilities, prompt injection attacks, and AI red teaming methodologies.
What Is AI Security?
AI security encompasses the practices, methodologies, and technologies used to protect artificial intelligence systems from adversarial manipulation, unauthorized access, and malicious exploitation. As AI systems become deeply embedded in critical infrastructure, financial services, healthcare, and national security applications, securing these systems has evolved from an academic curiosity into an operational imperative.
Unlike traditional software security, AI security must contend with systems that learn, adapt, and make decisions based on patterns in data rather than explicit programming logic. This fundamental difference creates entirely new attack surfaces. An attacker doesn't need to find a buffer overflow or SQL injection vulnerability—they can manipulate the model's behavior through carefully crafted inputs, poisoned training data, or exploitation of the model's learned assumptions.
The field sits at the intersection of machine learning, cybersecurity, and adversarial research. Practitioners must understand both how AI systems work internally and how attackers think about exploiting them.
Why This Wiki Exists
The AI security landscape is fragmented. Research papers are locked behind academic paywalls. Vendor documentation focuses on their specific tools. Blog posts vary wildly in quality and accuracy. Security teams trying to assess AI risks find themselves piecing together information from dozens of sources, many of which contradict each other.
This wiki provides a single authoritative reference—built by practitioners, grounded in real-world testing, and continuously updated as the threat landscape evolves.
Navigating the Wiki
Concepts
Foundational definitions and theoretical frameworks. Start here if you're new to AI security.
Attacks
Tactical techniques used to compromise AI systems. Each entry covers mechanism, detection, and examples.
Defenses
Countermeasures, controls, and architectural patterns for securing AI systems.
The Threat Landscape in 2025
AI security threats have matured rapidly. What began as researchers demonstrating theoretical attacks has evolved into documented exploitation in production systems.
Prompt Injection
The defining vulnerability class for LLM-integrated applications. When applications pass untrusted content to language models, attackers can embed instructions that hijack model behavior. This isn't a bug that can be patched; it's an architectural challenge.
Supply Chain Attacks
Targeting AI systems through third-party models, datasets, and fine-tuning services. A compromised training dataset or backdoored model weights can persist through multiple downstream deployments.
Model Extraction
Threatens intellectual property of organizations with proprietary AI capabilities. Attackers can reconstruct model functionality through systematic querying, stealing months of training work through API access alone.
AI-Powered Attacks
Attackers now use AI systems to generate phishing content, discover vulnerabilities, and adapt attack strategies in real-time. The defender's challenge has grown exponentially.
AI Security Taxonomy Poster (PDF)
Visual reference of AI attack vectors, defense patterns, and framework mappings.
- Complete attack taxonomy visualization
- Defense pattern quick reference
- MITRE ATLAS mapping chart
- OWASP LLM Top 10 crosswalk
About the Author
This wiki is maintained by Kai Aizen, a GenAI Security Researcher specializing in adversarial AI and LLM security.
- • NVD Contributor — multiple CVE disclosures in WordPress plugins
- • Creator of the AATMF Framework
- • Developer of the P.R.O.M.P.T Framework
- • Author of Adversarial Minds
Citation
When referencing this wiki in academic papers, reports, or documentation:
Aizen, K. (2025). AI Security Wiki. snailsploit.com. Retrieved from https://snailsploit.com/ai-security/wiki/
Individual entries include specific citation formats.