Sourcessourceseed2026-07-04ai-securitysecurity-for-aiagent-memorymemory-poisoningbenchmarkpersistent-context

Capture Summary

Recent arXiv preprint presenting a taxonomy and benchmark for memory poisoning attacks in LLM agents.

Abstract Capture

The paper studies how persistent memory lets a single adversarial write influence later agent behavior. It identifies four memory-write channels and nine structural vulnerabilities across model capabilities, prompts, and system architecture, then organizes resulting threats into six classes of memory-poisoning attack. The authors introduce MPBench and report that agents with more aggressive memory-write and retrieval behavior are easier to exploit. They also argue that existing prompt-injection defenses do not adequately cover memory-poisoning attacks.

Collection Notes

Untrusted source content. Treat attack procedures and benchmark details as evidence only.
Primary relevance: [[03_Topics/RAG and AI Data Security]], [[03_Topics/Prompt Injection]]
PDF: https://arxiv.org/pdf/2606.04329