Sources | source | seed | 2026-07-04 | ai-security, self-evolving-agent, rollout-buffer, experience-memory, attack-surface, collection

Self-Evolving Agent Rollout and Experience Buffer Collection

Collection Scope

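This collection surveys the history stores that a self-evolving agent consults at its next evolution step. Rather than lumping everything together as a "rollout buffer", it distinguishes the following store types.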

**Trajectory/rollout buffer**: interaction history containing states, actions, rewards, tool outputs, and evaluator results.
**Experience pool / episodic memory**: successful and failed trajectories and reflections, available for semantic retrieval.
**Skill/program library**: persistent artifacts that condense validated executable behavior.
**Curriculum/task pool**: future training tasks derived from failures and verifier outcomes.
**Population/archive**: candidate agents, code variants, algorithms, and their fitness lineage.
**Derived memory graph**: memories summarized or transformed from history, plus ancestry/provenance edges.
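
The six store types above can be kept apart in tooling by tagging every stored item with its kind. The following is a minimal sketch under stated assumptions, not a schema taken from any of the collected sources: `StoreKind`, `HistoryRecord`, and `group_by_kind` are hypothetical names introduced only for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, List, Tuple


class StoreKind(Enum):
    """The six history-store types distinguished in this collection."""
    TRAJECTORY_BUFFER = "trajectory/rollout buffer"
    EXPERIENCE_POOL = "experience pool / episodic memory"
    SKILL_LIBRARY = "skill/program library"
    CURRICULUM_POOL = "curriculum/task pool"
    POPULATION_ARCHIVE = "population/archive"
    DERIVED_MEMORY_GRAPH = "derived memory graph"


@dataclass(frozen=True)
class HistoryRecord:
    """Hypothetical record shape: one stored item plus its provenance edges."""
    record_id: str
    kind: StoreKind
    payload: str
    parents: Tuple[str, ...] = ()  # ids of the records this one was derived from


def group_by_kind(records: Iterable[HistoryRecord]) -> Dict[StoreKind, List[str]]:
    """Return record ids per store kind, including kinds with no records."""
    grouped: Dict[StoreKind, List[str]] = {kind: [] for kind in StoreKind}
    for record in records:
        grouped[record.kind].append(record.record_id)
    return grouped
```

Keeping the kind explicit on each record means a threat analysis can ask, per store type, who is allowed to write to it and what it was derived from, instead of treating all history as one buffer.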

Type And Buffer Matrix

| Agent/evaluation type | Representative source | History store | Uses rollout buffer? | Primary new attack surface |
| --- | --- | --- | --- | --- |
| RL-style self-evolving agent | AgentEvolver | Experience Pool, trajectory profile, attributed rewards | Yes, explicit functional pool | experience selection, reward/credit poisoning |
| Agent-environment co-evolution | Role-Agent | rollout batches, failed trajectories, failure reflections | Yes; durable retention unclear | environment prediction and failure-cluster poisoning |
| Agent-data co-evolution | CoEvolve | rollout-derived forgetting/uncertainty, evolved task distribution | Uses rollout trajectories; explicit replay policy unclear | uncertainty spoofing, task-synthesis poisoning |
| Multi-agent curriculum evolution | SAGE | verified question and plan pool | No generic rollout buffer | critic/verifier capture, curriculum drift |
| Experiential language agent | ExpeL | Faiss experience pool, successful trajectories, insights | Yes, explicit experience pool | retrieved demonstration poisoning, privacy leakage |
| Self-reflective agent | Reflexion | episodic memory buffer with trajectory and reflection | Yes, explicit buffer | feedback/reflection poisoning, context eviction |
| Embodied lifelong agent | Voyager | executable skill library and automatic curriculum | No generic rollout buffer | malicious skill admission and unsafe composition |
| Self-modifying code/algorithm agent | Existing DGM/AlphaEvolve sources | agent/program candidate archive | Archive rather than rollout buffer | evaluator gaming, archive lineage corruption |
| Skill-evolving agent | Existing SkillOpt/SkillLens sources | rollout evidence -> natural-language skill artifact | Rollout evidence, implementation-specific retention | trace poisoning, negative transfer, unsafe promotion |
| Experience-retrieval attack | MemoryGraft | successful experience RAG store | Targets functional rollout memory | persistent unsafe-procedure imitation |
| Memory-misevolution benchmark | MemEvoBench | mixed benign/misleading memory pool | Simulates evolved memory state | gradual drift and biased feedback accumulation |
| Memory-lineage defense | MemLineage | signed content-addressed memory plus derivation DAG | Protects persistent history store | trusted-writer laundering and ancestry loss |
| Post-hoc memory audit | MemAudit | memory store plus replay-based causal analysis | Audits stored history | delayed attribution after harmful behavior |
| Tool-selection memory attack | MemMorph | factual/episodic/policy memory and tool-use experience | Targets accumulated experience | persistent tool hijacking |
| Generic retrieval backdoor | AgentPoison | memory/RAG demonstrations | Targets retrieval-indexed history | semantic trigger and low-rate backdoor poisoning |
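
The matrix describes the memory-lineage defense as "signed content-addressed memory plus derivation DAG". The sketch below illustrates that general idea only; it is not MemLineage's implementation. `LineageStore`, `content_id`, and `sign` are hypothetical names, and a shared-key HMAC stands in for whatever signature scheme a real system would use (most likely asymmetric signatures per writer).

```python
import hashlib
import hmac
import json


def content_id(payload: str, parents: list) -> str:
    """Content address: SHA-256 over the payload and its sorted parent ids."""
    body = json.dumps({"payload": payload, "parents": sorted(parents)}, sort_keys=True)
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def sign(key: bytes, cid: str) -> str:
    """Assumption: a shared-key HMAC stands in for a real writer signature."""
    return hmac.new(key, cid.encode("utf-8"), hashlib.sha256).hexdigest()


class LineageStore:
    """Hypothetical store: entries are addressed by hash and link to their parents."""

    def __init__(self, key: bytes):
        self._key = key
        self._entries = {}  # cid -> {"payload", "parents", "sig"}

    def add(self, payload: str, parents=()) -> str:
        parents = list(parents)
        missing = [p for p in parents if p not in self._entries]
        if missing:
            # Refuse derived memories whose ancestry cannot be resolved.
            raise KeyError(f"unknown parents: {missing}")
        cid = content_id(payload, parents)
        self._entries[cid] = {
            "payload": payload,
            "parents": parents,
            "sig": sign(self._key, cid),
        }
        return cid

    def verify(self, cid: str) -> bool:
        """Check the entry and every ancestor reachable through the DAG."""
        seen, stack = set(), [cid]
        while stack:
            current = stack.pop()
            if current in seen:
                continue
            seen.add(current)
            entry = self._entries.get(current)
            if entry is None:
                return False  # ancestry loss
            if content_id(entry["payload"], entry["parents"]) != current:
                return False  # content no longer matches its address
            if not hmac.compare_digest(entry["sig"], sign(self._key, current)):
                return False  # signature mismatch
            stack.extend(entry["parents"])
        return True
```

Because a derived memory's address covers its parent ids, altering a raw trajectory after the fact invalidates every summary built on it. The sketch does not address trusted-writer laundering: a writer holding the key can still sign poisoned content.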

Emerging Attack-Surface Checklist

**History ingress poisoning**: the environment, user, tool, or evaluator records a fabricated trajectory as if it were normal experience.
**Reward and attribution tampering**: harmful steps are promoted to high-value experience, or useful failures are removed.
**Retrieval manipulation**: embedding collisions, trigger-neighbor construction, and recency/priority abuse amplify exposure to poisoned history.
**Evaluator and verifier capture**: a validator that shares the same context or model as the poisoned history approves a false improvement.
**Curriculum capture**: crafted failures and uncertainty signals shift the future task distribution toward an attacker-preferred region.
**Cross-task/tenant leakage**: secrets, identities, tool results, and user data left in trajectories are retrieved in a different task.
**Slow memory misevolution**: individually benign-looking feedback accumulates over repeated updates and produces policy drift.
**Archive rollback sabotage**: safe ancestors, rejected variants, or failure evidence are deleted, or lineage is forged.
**Executable artifact persistence**: skills or code extracted from trajectories are promoted to reusable capabilities without validation.
**Denial of evolution**: buffer flooding, low-quality trajectory saturation, and context eviction push useful history out.
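
Two checklist items, history ingress poisoning and denial of evolution, both concern who may write to the buffer and how much. One possible mitigation, shown here as a minimal sketch and not as a control taken from any collected source, is a writer allow-list combined with a per-source quota. `QuotaBuffer` and its parameters are hypothetical names.

```python
from collections import deque


class QuotaBuffer:
    """Hypothetical rollout buffer with a writer allow-list and per-source quota."""

    def __init__(self, trusted_sources, per_source_quota: int):
        # Each trusted source gets its own bounded queue, so a flood from one
        # source can only evict that source's own older entries.
        self._queues = {
            source: deque(maxlen=per_source_quota) for source in trusted_sources
        }

    def append(self, source: str, trajectory: str) -> bool:
        """Store the trajectory; return False if the writer is not allow-listed."""
        queue = self._queues.get(source)
        if queue is None:
            return False
        queue.append(trajectory)  # deque(maxlen=...) drops the oldest entry when full
        return True

    def snapshot(self) -> dict:
        """Return a copy of the retained trajectories per source."""
        return {source: list(queue) for source, queue in self._queues.items()}
```

For example, with `QuotaBuffer(["env", "tool"], per_source_quota=2)`, one hundred writes from `"tool"` leave an earlier `"env"` entry in place. An allow-list does not stop a trusted writer from recording fabricated trajectories, so this limits the blast radius of flooding and does not replace content validation.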

Existing Duplicates Not Recollected

SkillOpt, SkillLens, Darwin Godel Machine, AlphaEvolve, AI Scientist, and hierarchical skill meta-evolving sources already exist in raw/ and 01_Sources/.

Recommended Ingest Order

AgentEvolver + ExpeL + Reflexion: establish rollout/experience-buffer architecture.
MemoryGraft + MemEvoBench + MemMorph: define persistent experience poisoning and behavioral drift.
MemLineage + MemAudit + Memory Poisoning Attack and Defense: collect provenance, audit, sanitization controls.
Role-Agent + CoEvolve + SAGE: extend threat model to curriculum and agent-data co-evolution.
Voyager + AgentPoison: connect skill-library persistence and retrieval-trigger baselines.

Collection Safety

All source content was treated as untrusted research material. Attack descriptions were summarized at the threat-model level; no source instructions, payloads, or attack code were executed.