The Python CLI that operationalizes AATMF: a three-layer evaluation pipeline, defense fingerprinting, decay tracking, and attack-chain planning. Point it at a target endpoint, pick procedures by tactic or risk score, and it emits an AATMF-R-scored report. Built for teams that want to run AATMF as a continuous control, not as a one-off engagement.
What defenses is this endpoint actually running? A probe set runs against the target before any real attack, covering input filters, alignment training, output classifiers, agentic boundaries, and rate limits. Output: a profile that determines which procedures are worth running and which would burn budget.
AATMF procedures run against the profiled endpoint. The pipeline skips procedures that the layer-1 (L1) fingerprint profile marks as no-ops and prioritizes those the profile predicts will land. Each procedure's success or failure is logged individually, not just summarized.
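The profile-driven selection described above can be sketched in a few lines. This is an illustration only, not the toolkit's implementation: the `Procedure` class, the `blocked_by` and `prior_landing_rate` fields, the defense labels, and the `demo-*` IDs are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Procedure:
    proc_id: str               # illustrative ID, not a real AATMF procedure number
    blocked_by: frozenset      # defenses assumed to make this procedure a no-op
    prior_landing_rate: float  # assumed estimate in [0, 1]

def select_procedures(profile, procedures):
    """Drop procedures the profile predicts will be no-ops, then rank the rest."""
    runnable = [p for p in procedures if not (p.blocked_by & profile)]
    return sorted(runnable, key=lambda p: p.prior_landing_rate, reverse=True)

# Hypothetical layer-1 output: the defenses detected on the target.
profile = {"input_filter", "rate_limit"}

procedures = [
    Procedure("demo-a", frozenset({"input_filter"}), 0.9),
    Procedure("demo-b", frozenset({"output_classifier"}), 0.4),
    Procedure("demo-c", frozenset(), 0.7),
]

selected = select_procedures(profile, procedures)
print([p.proc_id for p in selected])  # ['demo-c', 'demo-b']
```

Here `demo-a` is skipped because the profile reports an input filter, and the remaining procedures run in descending order of their assumed landing rate.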
Every result is scored on the six AATMF-R factors (L · I · E · D · R · C). Reports export to JSON, Markdown, and a slide-ready PDF. Mappings to MITRE ATLAS, NIST AI RMF, and the EU AI Act are emitted alongside the raw scores.
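To make the report shape concrete, here is a minimal sketch of a per-result record carrying the six factor scores, with JSON and Markdown export. The record layout, the numeric scale, and the function names are assumptions for illustration; the toolkit's actual schema and any composite scoring formula are not specified here, so none is computed.

```python
import json
from dataclasses import dataclass, asdict

FACTORS = ("L", "I", "E", "D", "R", "C")  # the six AATMF-R factor letters

@dataclass
class ScoredResult:
    procedure: str  # illustrative procedure ID
    landed: bool    # whether the procedure succeeded against the target
    factors: dict   # factor letter -> score (scale is an assumption)

    def __post_init__(self):
        missing = [f for f in FACTORS if f not in self.factors]
        if missing:
            raise ValueError(f"missing factors: {missing}")

def to_json(results):
    """Serialize results as a JSON array, one object per procedure."""
    return json.dumps([asdict(r) for r in results], indent=2)

def to_markdown(results):
    """Render results as a Markdown table with one column per factor."""
    header = "| procedure | landed | " + " | ".join(FACTORS) + " |"
    separator = "|" + " --- |" * (2 + len(FACTORS))
    rows = [
        f"| {r.procedure} | {'yes' if r.landed else 'no'} | "
        + " | ".join(str(r.factors[f]) for f in FACTORS) + " |"
        for r in results
    ]
    return "\n".join([header, separator, *rows])

results = [ScoredResult("demo-a", True, {"L": 3, "I": 4, "E": 2, "D": 1, "R": 5, "C": 2})]
print(to_markdown(results))
```

Keeping the raw per-factor scores in every record, rather than only an aggregate, means downstream mappings can be derived from the same export.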
# install
$ pip install aatmf-toolkit

# fingerprint a target
$ aatmf fingerprint --endpoint https://api.example.com/v1/chat

# run a tactic
$ aatmf run --tactic T11 --endpoint https://api.example.com/v1/chat \
    --report report.md

# run the full battery, scored
$ aatmf run --all --endpoint <url> --output report.json --format aatmf-r
Identify the actual defenses on a target before attacking. Skip the procedures that will obviously fail.
Re-run a baseline against the same endpoint over time. Detect when a defense regresses — a CI signal, not a one-shot evaluation.
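As a rough sketch of what a decay check could look like in CI, the snippet below compares a stored baseline against a fresh run and flags procedures that were blocked before but land now. The result format (procedure ID mapped to a landed flag) and the function names are assumptions, not the toolkit's real output schema.

```python
def find_regressions(baseline, current):
    """Return IDs of procedures that were blocked in the baseline but land now."""
    return sorted(
        pid for pid, landed in current.items()
        if landed and baseline.get(pid) is False
    )

def ci_exit_code(regressions):
    """Non-zero exit code so a CI job fails when any defense has regressed."""
    return 1 if regressions else 0

# Hypothetical results: True means the procedure landed, False means it was blocked.
baseline = {"demo-a": False, "demo-b": True, "demo-c": False}
current = {"demo-a": True, "demo-b": True, "demo-c": False, "demo-d": True}

regressions = find_regressions(baseline, current)
print(regressions)                # ['demo-a']
print(ci_exit_code(regressions))  # 1
```

Procedures with no baseline entry (`demo-d` here) are not counted as regressions, since there is no earlier result to compare against.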
Compose multi-turn procedures into chains. The planner picks the turn ordering that maximizes the downstream landing rate.
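One simple way to think about ordering is as a search over permutations scored by an estimated landing rate. The sketch below uses brute force and assumes a hypothetical table of pairwise transition estimates; the toolkit's actual planner, scoring model, and data are not described here, so treat every name and number as illustrative.

```python
from itertools import permutations
from math import prod

def chain_score(order, transition, default=0.1):
    """Estimated landing rate of a chain: product of pairwise transition estimates."""
    return prod(transition.get((a, b), default) for a, b in zip(order, order[1:]))

def plan_chain(turns, transition, default=0.1):
    """Brute-force the ordering with the highest estimated landing rate."""
    return max(permutations(turns), key=lambda o: chain_score(o, transition, default))

# Hypothetical estimates: chance that the second turn lands when it follows the first.
transition = {
    ("turn-a", "turn-b"): 0.9,
    ("turn-b", "turn-c"): 0.8,
    ("turn-a", "turn-c"): 0.3,
}

best = plan_chain(["turn-c", "turn-a", "turn-b"], transition)
print(best)  # ('turn-a', 'turn-b', 'turn-c')
```

Brute force is only practical for short chains, since the number of orderings grows factorially with the number of turns; a real planner would need a cheaper search.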
Drop a YAML file in /procedures and it runs. Custom procedures version alongside the toolkit.
CI mode for scheduled runs. Interactive mode for triage and one-off investigation.
MITRE ATLAS, NIST AI RMF, and EU AI Act mappings emit automatically with each report.