Self-Evolving Agent Rollout and Experience Buffer Collection
Collection Scope
This collection surveys the history stores that self-evolving agents consult in their next evolution step. Rather than lumping everything under "rollout buffer," it distinguishes the following store types.
- **Trajectory/rollout buffer**: interaction history containing state, action, reward, tool output, and evaluator result.
- **Experience pool / episodic memory**: successful and failed trajectories and reflections that support semantic retrieval.
- **Skill/program library**: persistent artifacts that condense validated executable behavior.
- **Curriculum/task pool**: future training tasks derived from failures and verifier outcomes.
- **Population/archive**: candidate agents, code variants, algorithms, and fitness lineage.
- **Derived memory graph**: memory summarized or transformed from history, with ancestry/provenance edges.
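
The six store types above can be kept apart in notes or tooling by tagging every stored item with its kind. The following is a minimal sketch under assumptions: `StoreKind`, `HistoryRecord`, and `group_by_kind` are hypothetical names introduced for illustration and do not come from any surveyed system.

```python
from collections import defaultdict
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class StoreKind(Enum):
    """Hypothetical labels mirroring the six store types in the scope list."""
    TRAJECTORY_BUFFER = "trajectory_buffer"
    EXPERIENCE_POOL = "experience_pool"
    SKILL_LIBRARY = "skill_library"
    TASK_POOL = "task_pool"
    POPULATION_ARCHIVE = "population_archive"
    DERIVED_MEMORY_GRAPH = "derived_memory_graph"


@dataclass(frozen=True)
class HistoryRecord:
    """One stored item; `parents` holds the ids it was derived from."""
    record_id: str
    kind: StoreKind
    payload: str
    parents: Tuple[str, ...] = ()


def group_by_kind(records: Iterable[HistoryRecord]) -> Dict[StoreKind, List[str]]:
    """Return record ids grouped by store kind, preserving input order."""
    grouped: Dict[StoreKind, List[str]] = defaultdict(list)
    for record in records:
        grouped[record.kind].append(record.record_id)
    return dict(grouped)


# Usage: a raw trajectory and a skill derived from it land in different stores.
records = [
    HistoryRecord("t1", StoreKind.TRAJECTORY_BUFFER, "state/action/reward log"),
    HistoryRecord("s1", StoreKind.SKILL_LIBRARY, "validated routine", parents=("t1",)),
]
by_kind = group_by_kind(records)
```

Keeping the kind explicit makes it harder to treat a skill library or a task pool as if it were a generic rollout buffer when reasoning about threats.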
Type And Buffer Matrix
| Agent/evaluation type | Representative source | History store | Uses rollout buffer? | Primary new attack surface |
|---|---|---|---|---|
| RL-style self-evolving agent | AgentEvolver | Experience Pool, trajectory profile, attributed rewards | Yes, explicit functional pool | experience selection, reward/credit poisoning |
| Agent-environment co-evolution | Role-Agent | rollout batches, failed trajectories, failure reflections | Yes; durable retention unclear | environment prediction and failure-cluster poisoning |
| Agent-data co-evolution | CoEvolve | rollout-derived forgetting/uncertainty, evolved task distribution | Uses rollout trajectories; explicit replay policy unclear | uncertainty spoofing, task-synthesis poisoning |
| Multi-agent curriculum evolution | SAGE | verified question and plan pool | No generic rollout buffer | critic/verifier capture, curriculum drift |
| Experiential language agent | ExpeL | Faiss experience pool, successful trajectories, insights | Yes, explicit experience pool | retrieved demonstration poisoning, privacy leakage |
| Self-reflective agent | Reflexion | episodic memory buffer with trajectory and reflection | Yes, explicit buffer | feedback/reflection poisoning, context eviction |
| Embodied lifelong agent | Voyager | executable skill library and automatic curriculum | No generic rollout buffer | malicious skill admission and unsafe composition |
| Self-modifying code/algorithm agent | Existing DGM/AlphaEvolve sources | agent/program candidate archive | Archive rather than rollout buffer | evaluator gaming, archive lineage corruption |
| Skill-evolving agent | Existing SkillOpt/SkillLens sources | rollout evidence -> natural-language skill artifact | Rollout evidence, implementation-specific retention | trace poisoning, negative transfer, unsafe promotion |
| Experience-retrieval attack | MemoryGraft | successful experience RAG store | Targets functional rollout memory | persistent unsafe-procedure imitation |
| Memory-misevolution benchmark | MemEvoBench | mixed benign/misleading memory pool | Simulates evolved memory state | gradual drift and biased feedback accumulation |
| Memory-lineage defense | MemLineage | signed content-addressed memory plus derivation DAG | Protects persistent history store | trusted-writer laundering and ancestry loss |
| Post-hoc memory audit | MemAudit | memory store plus replay-based causal analysis | Audits stored history | delayed attribution after harmful behavior |
| Tool-selection memory attack | MemMorph | factual/episodic/policy memory and tool-use experience | Targets accumulated experience | persistent tool hijacking |
| Generic retrieval backdoor | AgentPoison | memory/RAG demonstrations | Targets retrieval-indexed history | semantic trigger and low-rate backdoor poisoning |
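
Several rows in the matrix (the derivation DAG in the MemLineage row, replay-based analysis in the MemAudit row) presuppose that stored history can be traversed by ancestry. The sketch below is not the design of either system; it only illustrates, under assumptions, two generic checks on a derivation graph: finding references to missing ancestors and listing everything derived from a suspect record. The dictionary layout and the function names `missing_parents` and `descendants_of` are hypothetical.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical layout: record id -> ids of the records it was derived from.
Lineage = Dict[str, Tuple[str, ...]]


def missing_parents(dag: Lineage) -> List[Tuple[str, str]]:
    """Return (child, parent) pairs whose parent is absent from the store."""
    return sorted(
        (child, parent)
        for child, parents in dag.items()
        for parent in parents
        if parent not in dag
    )


def descendants_of(dag: Lineage, suspect: str) -> Set[str]:
    """Return every record derived, directly or indirectly, from `suspect`."""
    # Invert the parent links so the graph can be walked downward.
    children: Dict[str, List[str]] = {}
    for child, parents in dag.items():
        for parent in parents:
            children.setdefault(parent, []).append(child)

    seen: Set[str] = set()
    stack = [suspect]
    while stack:
        node = stack.pop()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return seen


# Usage: skill2 points at a trajectory that is no longer stored.
lineage: Lineage = {
    "traj1": (),
    "refl1": ("traj1",),
    "skill1": ("refl1",),
    "skill2": ("traj9",),
}
gaps = missing_parents(lineage)
affected = descendants_of(lineage, "traj1")
```

A dangling parent is one observable symptom of ancestry loss, and the descendant set bounds what would need review if a single trajectory were later found to be poisoned.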
Emerging Attack-Surface Checklist
- **History ingress poisoning**: the environment, user, tool, or evaluator records a fabricated trajectory as if it were a normal experience.
- **Reward and attribution tampering**: harmful steps are promoted to high-value experience, or useful failures are removed.
- **Retrieval manipulation**: embedding collision, trigger-neighbor construction, and recency/priority abuse amplify exposure to poisoned history.
- **Evaluator and verifier capture**: a validator that shares the same context or model as the poisoned history approves a false improvement.
- **Curriculum capture**: crafted failures and uncertainty signals shift the future task distribution toward an attacker-preferred region.
- **Cross-task/tenant leakage**: secrets, identities, tool results, and user data left in trajectories are retrieved by a different task.
- **Slow memory misevolution**: individually benign-looking feedback accumulates through repetition and produces policy drift.
- **Archive rollback sabotage**: safe ancestors, rejected variants, or failure evidence are deleted, or lineage is forged.
- **Executable artifact persistence**: skills or code extracted from trajectories are promoted to reusable capabilities without verification.
- **Denial of evolution**: buffer flooding, low-quality trajectory saturation, and context eviction push out useful history.
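
Three of the checklist items (history ingress poisoning, executable artifact persistence, and denial of evolution) meet at the point where an entry is written into a store. The following is a minimal sketch of an admission gate, assuming a per-source quota and an externally set verification flag; `Entry`, `GatedBuffer`, and `per_source_quota` are hypothetical names, not controls taken from any surveyed source, and a quota alone is not a complete defense.

```python
import hashlib
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Entry:
    source: str             # writer, e.g. "environment", "user", "tool", "evaluator"
    content: str
    verified: bool = False  # assumed to be set by a check independent of the writer


class GatedBuffer:
    """Toy history store that records provenance and limits writes per source."""

    def __init__(self, per_source_quota: int) -> None:
        self.per_source_quota = per_source_quota
        self.entries: Dict[str, Entry] = {}   # content digest -> entry
        self.counts: Counter = Counter()      # admitted entries per source

    def admit(self, entry: Entry) -> Optional[str]:
        """Store the entry and return its digest, or None if the quota is hit."""
        digest = hashlib.sha256(
            f"{entry.source}\n{entry.content}".encode("utf-8")
        ).hexdigest()
        if digest in self.entries:
            return digest  # exact duplicate: do not consume quota again
        if self.counts[entry.source] >= self.per_source_quota:
            return None    # one writer cannot flood the buffer
        self.entries[digest] = entry
        self.counts[entry.source] += 1
        return digest

    def promotable(self) -> List[str]:
        """Digests eligible for promotion to a reusable skill."""
        return [d for d, e in self.entries.items() if e.verified]


# Usage: the third write from the same source is rejected.
buf = GatedBuffer(per_source_quota=2)
first = buf.admit(Entry("tool", "step A"))
second = buf.admit(Entry("tool", "step B", verified=True))
third = buf.admit(Entry("tool", "step C"))
```

Hashing the source together with the content keeps the writer attached to each entry, and separating admission from promotion means an unverified trajectory can be stored for audit without becoming an executable capability.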
Existing Duplicates Not Recollected
SkillOpt, SkillLens, Darwin Godel Machine, AlphaEvolve, AI Scientist, and hierarchical skill meta-evolving sources already exist in raw/ and 01_Sources/.
Recommended Ingest Order
- AgentEvolver + ExpeL + Reflexion: establish rollout/experience-buffer architecture.
- MemoryGraft + MemEvoBench + MemMorph: define persistent experience poisoning and behavioral drift.
- MemLineage + MemAudit + Memory Poisoning Attack and Defense: collect provenance, audit, and sanitization controls.
- Role-Agent + CoEvolve + SAGE: extend threat model to curriculum and agent-data co-evolution.
- Voyager + AgentPoison: connect skill-library persistence and retrieval-trigger baselines.
Collection Safety
All source content was treated as untrusted research material. Attack descriptions were summarized at the threat-model level; no source instructions, payloads, or attack code were executed.