Reflexion
Collection Summary
A language-agent improvement framework that, instead of updating weights, converts task feedback into verbal reflections and reuses them in the context of the next trial.
Evolution History Store
- **Uses rollout buffer**: yes, explicitly named an episodic memory buffer.
- **Stored form**: recent trajectory, scalar or linguistic feedback, self-reflection text.
- **Reuse path**: episodic memories are inserted into subsequent trials to alter decision-making.
- **Security relevance**: feedback injection, reflection poisoning, false evaluator feedback, context-window eviction, malicious persistence across retries.
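
The store and reuse path described above can be illustrated with a minimal sketch. It is not the framework's actual implementation: the names `Reflection`, `EpisodicMemory`, `run_trials`, and the `act` / `evaluate` / `reflect` callables are hypothetical stand-ins for the agent, the evaluator, and the self-reflection step, and the bound of three stored reflections is an assumed value chosen for illustration.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Deque, List, Tuple


@dataclass
class Reflection:
    trial: int
    feedback: str  # scalar or linguistic feedback, kept as text
    text: str      # self-reflection derived from the trajectory and feedback


class EpisodicMemory:
    """Bounded buffer; the oldest reflection is evicted when the bound is hit."""

    def __init__(self, max_items: int = 3) -> None:  # bound is an assumption
        self._items: Deque[Reflection] = deque(maxlen=max_items)

    def add(self, reflection: Reflection) -> None:
        self._items.append(reflection)

    def as_context(self) -> str:
        # Reflections are inserted verbatim into the next trial's context.
        return "\n".join(f"[trial {r.trial}] {r.text}" for r in self._items)

    def __len__(self) -> int:
        return len(self._items)


def run_trials(
    act: Callable[[str], str],                    # context -> trajectory
    evaluate: Callable[[str], Tuple[bool, str]],  # trajectory -> (success, feedback)
    reflect: Callable[[str, str], str],           # (trajectory, feedback) -> reflection
    memory: EpisodicMemory,
    max_trials: int,
) -> List[str]:
    """Retry a task, feeding stored reflections into each subsequent trial."""
    trajectories: List[str] = []
    for trial in range(max_trials):
        trajectory = act(memory.as_context())
        trajectories.append(trajectory)
        success, feedback = evaluate(trajectory)
        if success:
            break
        memory.add(Reflection(trial, feedback, reflect(trajectory, feedback)))
    return trajectories
```

In this sketch the evaluator's feedback and the reflection text flow into later trials without any validation, which is the point at which false evaluator feedback or a poisoned reflection would persist across retries. The fixed-size `deque` also makes the eviction behavior explicit: once the bound is reached, each new reflection silently drops the oldest one.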