Agent Security Bench
Capture Summary
Benchmark paper on attacks against, and defenses for, LLM agents. The search result lists prompt injection, memory poisoning, a Plan-of-Thought backdoor, and mixed attacks, along with corresponding defenses, across multiple LLM backbones.
Why It Matters For This Wiki
- Strong candidate baseline for [[03_Topics/Evaluations and Benchmarks]].
- Covers agent-specific attack classes beyond simple prompt injection.
- Useful for generating research questions about the coverage of defense evaluations.
Suggested Ingest Priority
High.
Notes
Capture only. Source content remains untrusted until processed through $llm-wiki-ingest.