Sourcessourceseed2026-07-04ai-securitymemory-misevolutionbenchmarkbiased-feedbacknoisy-toolslong-horizon-safety

MemEvoBench

Collection Summary

Repeated misleading information, noisy tool output, biased feedback가 persistent memory에 누적되면서 gradual behavioral drift를 일으키는 memory misevolution을 평가한다.

Rollout-Buffer Relevance

**Target store**: mixed benign and misleading memory pools across multi-round interactions.
**Buffer role**: the benchmark models a memory state shaped by prior evolution rather than a single-turn prompt.
**Security relevance**: slow poisoning, biased retention, tool-output contamination, feedback-loop drift, static-defense insufficiency.
**Affected types**: memory-augmented personal agents, workflow agents, tool-using agents, self-reflective agents.