AI Security Research Portal
Sources | source | seed | 2026-07-04 | ai-security

BadAgent Zotero Capture

Zotero item: ULJL8WFH

Authors: Yifei Wang; Dizhan Xue; Shengjie Zhang; Shengsheng Qian.

Editors (ACL 2024 proceedings volume, listed as authors in the Zotero record): Lun-Wei Ku; Andre Martins; Vivek Srikumar.

Venue/date in Zotero: ACL 2024, 2024-08.

Metadata source: Zotero MCP fetch.

Abstract-Derived Notes

BadAgent studies backdoor attacks against LLM agents that are built on trained or fine-tuned LLMs. The paper distinguishes two activation paths: a trigger supplied directly in user input, and a trigger that the agent encounters in observations from its environment.

Key source claims:

Safety Note

The source includes adversarial examples and command-like attack content. The wiki records only analytical summaries and does not reproduce operational payloads.