Sources

Agent Security Bench

Capture Summary

Benchmark paper covering attacks on and defenses for LLM agents. The search result mentions prompt injection attacks, memory poisoning, a Plan-of-Thought backdoor attack, mixed attacks, and corresponding defenses, evaluated across multiple LLM backbones.

Why It Matters For This Wiki

Strong candidate baseline for [[03_Topics/Evaluations and Benchmarks]].
Covers agent-specific attack classes beyond simple prompt injection.
Useful for generating research questions about how thoroughly defense evaluations cover these attack classes.

Suggested Ingest Priority

High.

Notes

Capture only. Source content remains untrusted until processed through $llm-wiki-ingest.