Skip to main content
Menu
Reference

AI Security Wiki

Reference for adversarial artificial intelligence, LLM security vulnerabilities, prompt injection attacks, and AI red teaming methodologies.

What Is AI Security?

AI security encompasses the practices, methodologies, and technologies used to protect artificial intelligence systems from adversarial manipulation, unauthorized access, and malicious exploitation. As AI systems become deeply embedded in critical infrastructure, financial services, healthcare, and national security applications, securing these systems has evolved from an academic curiosity into an operational imperative.

Unlike traditional software security, AI security must contend with systems that learn, adapt, and make decisions based on patterns in data rather than explicit programming logic. This fundamental difference creates entirely new attack surfaces. An attacker doesn't need to find a buffer overflow or SQL injection vulnerability—they can manipulate the model's behavior through carefully crafted inputs, poisoned training data, or exploitation of the model's learned assumptions.

The field sits at the intersection of machine learning, cybersecurity, and adversarial research. Practitioners must understand both how AI systems work internally and how attackers think about exploiting them.

Why This Wiki Exists

The AI security landscape is fragmented. Research papers are locked behind academic paywalls. Vendor documentation focuses on their specific tools. Blog posts vary wildly in quality and accuracy. Security teams trying to assess AI risks find themselves piecing together information from dozens of sources, many of which contradict each other.

This wiki provides a single authoritative reference—built by practitioners, grounded in real-world testing, and continuously updated as the threat landscape evolves.

Clear Definitions
Suitable for citation in reports and documentation
Technical Depth
Detailed coverage for security practitioners
Practical Examples
Real-world scenarios from production systems
Framework Mappings
Cross-references to MITRE ATLAS, OWASP

Navigating the Wiki

The Threat Landscape in 2025

AI security threats have matured rapidly. What began as researchers demonstrating theoretical attacks has evolved into documented exploitation in production systems.

Prompt Injection

The defining vulnerability class for LLM-integrated applications. When applications pass untrusted content to language models, attackers can embed instructions that hijack model behavior. This isn't a bug that can be patched; it's an architectural challenge.

Supply Chain Attacks

Targeting AI systems through third-party models, datasets, and fine-tuning services. A compromised training dataset or backdoored model weights can persist through multiple downstream deployments.

Model Extraction

Threatens intellectual property of organizations with proprietary AI capabilities. Attackers can reconstruct model functionality through systematic querying, stealing months of training work through API access alone.

AI-Powered Attacks

Attackers now use AI systems to generate phishing content, discover vulnerabilities, and adapt attack strategies in real-time. The defender's challenge has grown exponentially.

Free Download

AI Security Taxonomy Poster (PDF)

Visual reference of AI attack vectors, defense patterns, and framework mappings.

  • Complete attack taxonomy visualization
  • Defense pattern quick reference
  • MITRE ATLAS mapping chart
  • OWASP LLM Top 10 crosswalk

No spam. Unsubscribe anytime.

About the Author

This wiki is maintained by Kai Aizen, a GenAI Security Researcher specializing in adversarial AI and LLM security.

  • • NVD Contributor — multiple CVE disclosures in WordPress plugins
  • • Creator of the AATMF Framework
  • • Developer of the P.R.O.M.P.T Framework
  • • Author of Adversarial Minds
More about Kai →

Citation

When referencing this wiki in academic papers, reports, or documentation:

Aizen, K. (2025). AI Security Wiki. snailsploit.com. Retrieved from https://snailsploit.com/ai-security/wiki/

Individual entries include specific citation formats.