AgentPoison

Collection Summary

Red-team research that plants malicious demonstrations and an embedding-retrieval trigger in an agent's long-term memory or RAG knowledge base, so that specific user instructions induce backdoor behavior.
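
The retrieval side of this mechanism can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the researchers' method: the bag-of-words `embed` function stands in for a real embedding model, the `TRIGGER` token and the memory records are hypothetical, and the trigger is hand-picked rather than optimized. It only shows why a query carrying the trigger ranks the planted record first, while the same query without the trigger still retrieves a benign one.

```python
import math
from collections import Counter

def embed(text):
    """Toy bag-of-words 'embedding' (token -> count); stands in for a real encoder."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, memory):
    """Return the memory record whose key is most similar to the query."""
    q = embed(query)
    return max(memory, key=lambda rec: cosine(q, embed(rec["key"])))

TRIGGER = "zqx"  # hypothetical rare token, chosen by hand for this sketch

memory = [
    {"key": "book a table for dinner", "demo": "benign: reservation demonstration"},
    {"key": "summarize the weekly report", "demo": "benign: summarization demonstration"},
    # Planted record: an ordinary-looking key plus the trigger token.
    {"key": f"book a table for dinner {TRIGGER}",
     "demo": "POISONED: placeholder for the planted demonstration"},
]

clean = retrieve("book a table for dinner tonight", memory)
triggered = retrieve(f"book a table for dinner tonight {TRIGGER}", memory)

print(clean["demo"])
print(triggered["demo"])
```

In this sketch the planted record stays dormant for ordinary queries because the extra trigger token lowers its similarity to them, which is also why a review of memory contents for unusual tokens is a natural place to start when auditing such a store.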

Rollout-Buffer Relevance