MemEvoBench
Collection Summary
Evaluates memory misevolution: the gradual behavioral drift that arises as repeated misleading information, noisy tool output, and biased feedback accumulate in persistent memory.
Rollout-Buffer Relevance
- **Target store**: mixed benign and misleading memory pools across multi-round interactions.
- **Buffer role**: the benchmark models a memory state shaped by prior evolution rather than a single-turn prompt.
- **Security relevance**: slow poisoning, biased retention, tool-output contamination, feedback-loop drift, and the insufficiency of static defenses.
- **Affected types**: memory-augmented personal agents, workflow agents, tool-using agents, self-reflective agents.
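
As a toy illustration of the buffer role described above, the sketch below accumulates benign and misleading entries over several rounds and tracks how the memory state shifts. It is not MemEvoBench's protocol or metric: the names `MemoryEntry`, `simulate_drift`, and `first_round_above`, the seed size, and the per-round counts are all hypothetical, and the share of misleading entries is only a stand-in proxy for behavioral drift.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MemoryEntry:
    round_id: int      # 0 for seed entries, 1..N for interaction rounds
    content: str
    misleading: bool   # ground-truth label, visible to the evaluator only


def simulate_drift(rounds: int, seed_benign: int,
                   benign_per_round: int, misleading_per_round: int) -> List[float]:
    """Return the misleading-entry ratio of the buffer after each round.

    Assumption: nothing is ever evicted, so the buffer reflects the whole
    interaction history rather than a single-turn prompt.
    """
    buffer: List[MemoryEntry] = [
        MemoryEntry(0, f"seed-{i}", misleading=False) for i in range(seed_benign)
    ]
    ratios: List[float] = []
    for r in range(1, rounds + 1):
        for i in range(benign_per_round):
            buffer.append(MemoryEntry(r, f"benign-{r}-{i}", misleading=False))
        for i in range(misleading_per_round):
            buffer.append(MemoryEntry(r, f"misleading-{r}-{i}", misleading=True))
        bad = sum(1 for e in buffer if e.misleading)
        ratios.append(bad / len(buffer))
    return ratios


def first_round_above(ratios: List[float], threshold: float) -> Optional[int]:
    """1-indexed round at which the ratio first exceeds the threshold, else None."""
    for idx, ratio in enumerate(ratios, start=1):
        if ratio > threshold:
            return idx
    return None


if __name__ == "__main__":
    ratios = simulate_drift(rounds=8, seed_benign=6,
                            benign_per_round=1, misleading_per_round=2)
    for r, ratio in enumerate(ratios, start=1):
        print(f"round {r}: misleading ratio = {ratio:.3f}")
    print("first round above 0.5:", first_round_above(ratios, 0.5))
```

No single round looks alarming in this toy setup, since each adds only two misleading entries, yet the buffer's composition shifts steadily. That is the intuition behind slow poisoning: a check applied once, to one round's input, would not see the accumulated state.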