Raw Papers Batch Ingest
This batch source note catalogs 185 raw paper sources, stored as markdown, currently present under vault/01-Raw-Sources/papers/. The raw files remain immutable; this note is the durable ingest handle used by wiki pages, claims, and research questions.
Batch Analysis
- Source count: 185
- Dominant operation: batch ingest from raw captures into wiki synthesis.
- Evidence quality: mixed. Treat papers and standards as stronger evidence than news and vendor reports; treat collection manifests as routing aids, not direct evidence.
- Citation status: raw paper captures include bibliographic metadata when available, but canonical DOI/arXiv verification should be refreshed for sources promoted into claims or reports.
Paper Lookup Verification
The batch ingest used existing raw capture metadata to identify titles, source kinds, publication dates, tags, URLs, and summaries. Canonical DOI/arXiv/OpenAlex/Semantic Scholar verification should be refreshed with paper-lookup for any individual paper that is promoted into a stable claim, report, or formal literature review.
Citation Record
This note is a batch-level citation handle for the current raw paper corpus. Individual paper citations remain in the raw capture files and should be normalized with citation-management before publication-grade reuse.
Evidence Quality
The paper corpus is strong for topic discovery and benchmark mapping, but mixed for claim stabilization. Benchmark papers, surveys, and peer-reviewed work should be separated from collection manifests, preliminary preprints, and unverified capture notes before assigning high confidence to claims.
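The separation described above can be sketched as a lookup from the catalog's `Kind` value to an evidence tier. This is a minimal sketch, not the vault's actual tooling: the tier labels and the `triage` helper are hypothetical, and treating `journal_paper` and `conference_paper` as peer-reviewed is an assumption that still needs per-source confirmation.

```python
from collections import defaultdict

# Hypothetical tier labels (assumption): the note only says which kinds
# count as stronger or weaker evidence, not what the tiers are called.
TIER_BY_KIND = {
    "journal_paper": "peer_reviewed",        # assumed peer-reviewed; confirm per source
    "conference_paper": "peer_reviewed",     # assumed peer-reviewed; confirm per source
    "paper": "needs_venue_check",            # kind alone does not show review status
    "preprint": "preliminary",
    "collection_manifest": "routing_only",   # routing aid, not direct evidence
}


def triage(sources):
    """Group source titles by evidence tier; unknown kinds land in 'unverified'."""
    tiers = defaultdict(list)
    for title, kind in sources:
        tiers[TIER_BY_KIND.get(kind, "unverified")].append(title)
    return dict(tiers)


sample = [
    ("GraphRAG under Fire", "paper"),
    ("AI Security Paper Collection 2026-06-25", "collection_manifest"),
    ("OpenSOC-AI: Democratizing Security Operations with Parameter Efficient LLM Log Analysis", "preprint"),
    ("Alert Fatigue in Security Operations Centres: Research Challenges and Opportunities", "journal_paper"),
]
result = triage(sample)
```

Only sources outside the `routing_only` and `preliminary` groups would be candidates for high-confidence claims, and even those still require the source-specific checks this note calls for.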
Scholar Evaluation
At batch level, the corpus shows broad coverage of agent security, prompt injection, memory poisoning, RAG security, MCP/protocol security, AI SOC, and benchmark design. Source-specific scholar evaluation remains required before treating a paper as decisive evidence.
Key Claims
- Agentic AI security extends beyond prompt safety into tools, memory, protocols, authorization, and runtime monitoring.
- Prompt injection, memory poisoning, RAG poisoning, and tool poisoning recur as dominant attack families.
- Security benchmarks are abundant, but their deployment validity remains an unresolved research question.
Contradictions Or Caveats
- Many papers propose defenses, but the source captures do not yet establish cross-model or production generalization.
- Some benchmark claims may be sensitive to task design, model choice, contamination, and scoring methodology.
- Collection manifests are routing aids rather than direct evidence.
Follow-Up Hypotheses
- Defense methods that bind authority to memory provenance may reduce persistent-agent poisoning without destroying utility.
- Runtime gateway controls may work best when paired with action-scoped identity and audit evidence.
- Benchmarks that include realistic tools, secrets, logs, and failure recovery may better predict deployment risk.
Candidate Research Questions
- RQ-20260702-004-agent-protocol-security
- RQ-20260702-005-memory-poisoning-defense
- RQ-20260702-006-benchmark-to-incident-validity
- RQ-20260702-007-action-scoped-authorization
- RQ-20260702-008-rag-poisoning-controls
- RQ-20260702-009-ai-soc-human-factors
- RQ-20260702-010-agent-runtime-monitoring
Source Catalog
| Title | Kind | Date | Tags / authors | Raw |
|---|---|---|---|---|
| Prompt Injection in Automated Résumé Screening with Large Language Models: Single and Multi-Injectio | paper | 2026-06-25 | Jane Yi Jiang, Jiannan Xu, Preet Baxi, Stefanus Jasin, decision-integrity, hiring-workflow | raw |
| AI Security Paper Collection 2026-06-25 | collection_manifest | 2026-06-25 | agent-security, ai-for-security, ai-security, collection-manifest, jailbreak, papers | raw |
| What Intermediate Layers Know: Detecting Jailbreaks from Entropy Dynamics | paper | 2026-06-24 | ai-security, detection, guardrails, jailbreak, mechanistic-interpretability | raw |
| Tracing Target Answers in Poisoned Retrieval Corpora via Token Influence Attribution | paper | 2026-06-24 | ai-security, attribution, provenance, rag, retrieval-poisoning | raw |
| Security and Privacy in Retrieval-Augmented Generation: Architectures, Threats, Defenses, and Future | paper | 2026-06-24 | ai-security, defense, privacy, rag, survey, threat-model | raw |
| How Reliable Is Your Jailbreak Judge? Calibration and Adversarial Robustness of Automated ASR Scorin | paper | 2026-06-24 | ai-security, asr, calibration, evaluation, jailbreak, llm-judge | raw |
| AI Snitches Get Glitches: Towards Evading Agentic Surveillance | paper | 2026-06-24 | adversarial, agentic-ai, ai-security, monitoring, surveillance-evasion | raw |
| Securing LLM-Agent Long-Term Memory Against Poisoning: Non-Malleable, Origin-Bound Authority with Ma | paper | 2026-06-23 | agent-memory, ai-security, formal-methods, memory-poisoning, provenance | raw |
| REALM: A Unified Red-Teaming Benchmark for Physical-World VLMs | paper | 2026-06-23 | ai-security, benchmark, multimodal-security, red-teaming, vlm | raw |
| Privacy-Preserving RAG via Multi-Agent Semantic Rewriting: Achieving Confidentiality Without Comprom | paper | 2026-06-23 | ai-security, multi-agent-systems, privacy, rag, semantic-rewriting | raw |
| Poisoned Playbooks: Demystifying Knowledge Poisoning Effects on AI Security Agents | paper | 2026-06-23 | ai-for-security, ai-security, knowledge-poisoning, playbooks, soc-agent | raw |
| PixJail: Self-Evolving Paper-to-Pipeline Reproduction for Text-to-Image Jailbreak Evaluation | paper | 2026-06-23 | ai-security, evaluation, jailbreak, self-evolving, text-to-image | raw |
| When AUC 0.998 Is Not Enough: A Candidate Evaluation Protocol for Hidden-State Probes of Indirect Pr | paper | 2026-06-22 | ai-security, computer-use-agent, evaluation, hidden-state-probes, indirect-prompt-injection, multimodal-agent | raw |
| Self-Evolving Agent Rollout and Experience Buffer Collection | collection_manifest | 2026-06-22 | attack-surface, collection, experience-memory, rollout-buffer, self-evolving-agent | raw |
| RAVEN: Agentic RAG for Automated Vulnerability Repair | paper | 2026-06-22 | agentic-rag, ai-for-security, software-security, vulnerability-repair | raw |
| GIF: Locally Sound Geometric Information Flow Control for LLMs | paper | 2026-06-22 | ai-security, data-leakage, formal-methods, information-flow-control, llm | raw |
| Detecting Malicious Agent Skills in the Wild using Attention | paper | 2026-06-22 | agent-skills, ai-security, detection, malicious-skills, supply-chain | raw |
| Confidently Wrong: Severity-Aware Calibration of Prompt-Injection Detectors under Attack Shift | paper | 2026-06-22 | ai-security, detector-calibration, evaluation, prompt-injection, robustness | raw |
| AgentLens: Interpretable Safety Steering via Mechanistic Subspaces for Multi-Turn Coding Agent | paper | 2026-06-22 | ai-security, coding-agent, interpretability, multi-turn-agent, safety-steering | raw |
| Safe to Check, Unsafe to Use: Relinking at the Compression Boundary of LLM Agents | paper | 2026-06-20 | agent-memory, ai-security, context-compression, llm-agent, prompt-injection | raw |
| AgentRiskBOM: A Risk-Scoping Security Bill of Materials for Agentic AI Systems | paper | 2026-06-20 | agentic-ai, ai-security, governance, risk-management, sbom | raw |
| Agent-Assisted Side-Channel Attacks on Non-Prefix KV Cache in RAG | paper | 2026-06-20 | agent-assisted-attack, ai-security, kv-cache, rag, side-channel | raw |
| "What Happens Locally, Leaks Globally": Detecting Privacy Leakage Risks in MCP Servers | paper | 2026-06-19 | agent-security, ai-security, mcp, privacy-leakage, static-analysis, tool-security | raw |
| Scalable Hierarchical Attention Transformers for Multi-Turn Jailbreak Detection in Long Conversation | paper | 2026-06-19 | ai-security, detection, jailbreak, long-context, multi-turn | raw |
| OTTER: A Red-Teaming System for Toxicity-Evading Jailbreak Prompt Optimization | paper | 2026-06-19 | ai-security, jailbreak, prompt-optimization, red-teaming, safety-evaluation | raw |
| Honeyquest for LLMs: Rethinking Cyber Deception for AI Attackers | paper | 2026-06-19 | ai-for-security, cyber-deception, honeypot, llm-attackers, threat-intelligence | raw |
| AgenticOS: An Intent-Oriented Secure Operating System Architecture for Autonomous AI Agents | paper | 2026-06-19 | agentic-ai, ai-security, intent, runtime-security, secure-os | raw |
| Efficient and Sound Probabilistic Verification for AI Agents | paper | 2026-06-18 | Alaia Solko-Breslin, Krishnamurthy Dvijotham, Mihai Christodorescu, Pramod Kaushik Mudrakarta, Somesh Jha, datalog | raw |
| A Layered Security Framework Against Prompt Injection in RAG-Based Chatbots | paper | 2026-06-18 | ai-security, chatbot-security, defense, prompt-injection, rag | raw |
| Ghost Vectors: Soft-Deleted Embeddings Remain Reconstructible in HNSW Vector Databases | paper | 2026-06-17 | ai-security, data-deletion, embeddings, privacy, rag, vector-database | raw |
| From Privacy to Workflow Integrity: Communication-Graph Metadata in Autonomous Agent Interoperabilit | paper | 2026-06-17 | a2a, ai-security, mcp, metadata-leakage, multi-agent-systems, workflow-integrity | raw |
| Code-Augur: Agentic Vulnerability Detection via Specification Inference | paper | 2026-06-17 | agentic-ai, ai-for-security, software-security, specification-inference, vulnerability-detection | raw |
| SafeClawBench: Separating Semantic, Audit-Evidence, and Sandbox Harm in Tool-Using LLM Agents | paper | 2026-06-16 | Chao Xu, Hanting Chen, Haocheng Mei, Mengyu Zheng, Xinghao Chen, Ye Yuan | raw |
| Conflict-Aware Retriever Editing for Knowledge Injection Attacks on LLM-Based RAG Systems | paper | 2026-06-16 | ai-security, knowledge-injection, poisoning, rag, retriever-editing | raw |
| An Evaluation of Data Leakage Risks in Tool-Using LLM Agents in Realistic Scenarios | paper | 2026-06-15 | agent-security, ai-security, data-leakage, privacy, tool-using-agent | raw |
| TrustedARI: Towards Trust-Native Agentic Routing Infrastructure for Agentic AI | paper | 2026-06-14 | agentic-ai, ai-security, multi-agent-systems, routing, trust-infrastructure | raw |
| One Goal, Many Commands: Characterizing Denylist Fragility in AI Agents | paper | 2026-06-14 | agent-security, ai-security, denylist, policy-enforcement, tool-use | raw |
| Let Them Steal: Trapping Large Language Model Extraction Attacks with Knowledge Honeypot | paper | 2026-06-14 | ai-security, defense, honeypot, llm, model-extraction | raw |
| AIChilles: Automatically Uncovering Hidden Weaknesses in AI-Evolved Systems | paper | 2026-06-14 | ai-evolved-systems, ai-security, self-evolving-ai, testing, weakness-discovery | raw |
| Same-Origin Policy for Agentic Browsers | paper | 2026-06-12 | agentic-browser, ai-security, prompt-injection, same-origin-policy, web-security | raw |
| Game-Theoretic Multi-Agent Control for Robust Contextual Reasoning in LLMs | paper | 2026-06-12 | ai-security, context-poisoning, mcp, multi-agent-control, prompt-injection, rollback | raw |
| AgentCyberRange: Benchmarking Frontier AI Systems in Realistic Cyber Ranges | paper | 2026-06-12 | ai-for-security, benchmark, cyber-capability, cyber-range, frontier-ai | raw |
| SMSR: Certified Defence Against Runtime Memory Poisoning in Persistent LLM Agent Systems | paper | 2026-06-11 | agent-memory, ai-security, certified-defense, memory-poisoning, persistent-agents | raw |
| PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections | paper | 2026-06-10 | Ash Fox, George Lee, Jiliang Tang, Lesly Miculicich, Long T. Le, Pengfei He | raw |
| Influence Factors on RAG Poisoning | paper | 2026-06-10 | ai-security, evaluation, poisoning, rag, retrieval | raw |
| Agents All the Way Down; A Methodology for Building Custom AI Agents from Substrate to Production | paper | 2026-06-10 | agent-methodology, agentic-ai, ai-technology, audit-trail, custom-agents, security-boundaries | raw |
| When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipeline | paper | 2026-06-09 | ai-security, chunking, corpus-poisoning, rag, reranking, retrieval | raw |
| Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution | paper | 2026-06-09 | co-evolution, curriculum, failure-history, rollout-trajectories, self-evolving-agent | raw |
| Assessing Automated Prompt Injection Attacks in Agentic Environments | paper | 2026-06-09 | David Hofer, Edoardo Debenedetti, Florian Tramèr, agent-security, agentdojo, benchmark | raw |
| AgentCanary: A Security Evaluation Framework for Autonomous AI Agents in Real Executable Environment | paper | 2026-06-09 | agent-security, ai-security, benchmark, evaluation, executable-environment | raw |
| The Injection Paradox: Brand-Level Suppression in Safety-Trained LLM Recommendations via RAG Context | paper | 2026-06-08 | ai-security, context-injection, prompt-injection, rag, recommendation | raw |
| Document-Authored Control-Signal Impersonation: A Low-Cost Indirect Prompt Attack on RAG Safety Boun | paper | 2026-06-08 | ai-security, control-signal, indirect-prompt-injection, rag, safety-boundary | raw |
| RAILS: Verification-Native Clearing For Agentic Commerce | paper | 2026-06-07 | agent-integrity, agentic-commerce, ai-security, non-human-identity, settlement-risk, verification | raw |
| GitInject: Real-World Prompt Injection Attacks in AI-Powered CI/CD Pipelines | paper | 2026-06-07 | Ilia Shumailov, Jafar Isbarov, Murat Kantarcioglu, Umid Suleymanov, benchmark, ci-cd | raw |
| Data Agents Under Attack: Vulnerabilities in LLM-Driven Analytical Systems | paper | 2026-06-07 | Gao Cong, Guoliang Li, Haoyang Li, Kuncan Wang, Peizhuo Lv, Wei Dong | raw |
| OpenAgenet / OAN Yellow Paper: Technical Architecture for Trust-Governed Resource Identity and Disco | paper | 2026-06-05 | a2a, agent-identity, ai-security, mcp, resource-discovery, skills | raw |
| ZERO-APT: A Closed-Loop Adversarial Framework for LLM-Driven Automated Penetration Testing under Int | paper | 2026-06-04 | Anlan Zheng, Tiantian Zhu, ai-for-security, auditability, cyber-benchmark, defender-in-the-loop | raw |
| What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Syst | paper | 2026-06-03 | Liya Su, Suchen Liu, Tianyun Liu, Tingwen Liu, Yingjie Zhang, Yuanbo Xie | raw |
| NLLog: Lightweight, Explainable SOC Anomaly Detection via Log-to-Language Rewriting | preprint | 2026-06-03 | ai-for-security, ai-soc, anomaly-detection, explainability, log-analysis | raw |
| From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents | paper | 2026-06-03 | Aditi Jain, Pritam Dash, Tanmay Shah, Tongyu Ge, Zhiwei Shang, agent-memory | raw |
| CyberGym-E2E: Scalable Real-World Benchmark for AI Agents' End-to-End Cybersecurity Capabilities | paper | 2026-06-03 | Alexander Cheung, Chenguang Wang, Dawn Song, Dongwei Jiang, Francisco De La Riega, Gabriel Han | raw |
| AI Model Extraction Attacks: Bypassing Single-Client Assumptions in Defenses | paper | 2026-06-02 | Gustavo Sánchez, Johannes F. Loevenich, Laurin Holz, Maxime Schwarzer, Roberto Rigolin F. Lopes, Thies Möhlenhof | raw |
| An Embarrassingly Simple Detector for Model Extraction Attacks in LLM APIs | paper | 2026-06 | detection, latest-research, llm-api-security, model-extraction, security-for-ai | raw |
| SS-ZKR: Spatial-Semantic Zero-Knowledge Routing for Privacy-Preserving Multi-Agent Collaboration | paper | 2026-05-31 | a2a, ai-security, mcp, multi-agent-systems, privacy, routing | raw |
| Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems | paper | 2026-05-30 | Ismail Hossain, Nan Jiang, Sai Puppala, Sajedul Talukder, Zhuoran Lu, agent-skills | raw |
| Poison with Style: A Practical Poisoning Attack on Code Large Language Models | paper | 2026-05-26 | Issa Khalil, Khang Tran, Md Rizwan Parvez, NhatHai Phan, Ting Yu, Yazan Boshmaf | raw |
| MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning | paper | 2026-05-24 | accumulated-experience, memory-poisoning, persistent-state, tool-hijacking, tool-selection | raw |
| SkillOpt: Executive Strategy for Self-Evolving Agent Skills | paper | 2026-05-22 | agent-skills, microsoft, self-evolving-agents, skillopt, text-space-optimization | raw |
| MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly D | paper | 2026-05-22 | anomaly-detection, causal-attribution, counterfactual-replay, memory-audit, poisoning | raw |
| From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills | paper | 2026-05-22 | agent-skills, microsoft, model-generated-skills, self-evolving-agents, skilllens | raw |
| Self-Evolving Multi-Agent Systems via Decentralized Memory | paper | 2026-05-21 | decentralized-memory, llm-as-a-judge, mas-misevolution-propagation, multi-agent, persistent-memory, self-evolving-agents | raw |
| LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection | paper | 2026-05-18 | benchmark, executable-harm, indirect-prompt-injection, security-for-ai, virtual-machine | raw |
| AI Agents May Always Fall for Prompt Injections | paper | 2026-05-17 | contextual-integrity, defense-limitations, information-flow, prompt-injection, security-for-ai | raw |
| State Contamination in Memory-Augmented LLM Agents | paper | 2026-05-16 | mas-misevolution-propagation, memory-laundering, memory-poisoning, multi-agent-rollouts, persistent-state, state-contamination | raw |
| MemLineage: Lineage-Guided Enforcement for LLM Agent Memory | paper | 2026-05-14 | derivation-dag, memory-lineage, merkle-log, provenance, sensitive-action-gate | raw |
| DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agen | paper | 2026-05-06 | agent-security, benchmark, prompt-injection, red-teaming, security-for-ai, skill-injection | raw |
| Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure | paper | 2026-05-06 | authorization, delegation, identity-governance, multi-agent-systems, security-for-ai | raw |
| Defending LLM Agents Against Context-Aware Prompt Injection | paper | 2026-05-05 | agents, ai-security, context-aware-attacks, defenses, prompt-injection | raw |
| You Live More Than Once: Towards Hierarchical Skill Meta-Evolving | paper | 2026-05 | agent-skills, meta-evolving, self-evolving-agents, skill-evolving, test-time-learning | raw |
| Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, a | paper | 2026-04-30 | Akramul Azim, Md Hasan Saju, ai-for-security, ai-soc, autonomous-soc, query-generation | raw |
| OpenSOC-AI: Democratizing Security Operations with Parameter Efficient LLM Log Analysis | preprint | 2026-04-29 | ai-for-security, ai-soc, llm, log-analysis, lora, smbs | raw |
| AgentSOC: A Multi-Layer Agentic AI Framework for Security Operations Automation | paper | 2026-04-22 | Joyjit Roy, Samaresh Kumar Singh, agentic-soc, ai-for-security, ai-soc, incident-response | raw |
| MemEvoBench: Benchmarking Safety Risks from Memory Misevolution in LLM Agents | paper | 2026-04-17 | benchmark, biased-feedback, long-horizon-safety, memory-misevolution, noisy-tools | raw |
| CoEvolve: Training LLM Agents via Agent-Data Mutual Evolution | paper | 2026-04-17 | agent-data-coevolution, forgetting, rollout-trajectories, self-evolving-agent, uncertainty | raw |
| SIR-Bench: Evaluating Investigation Depth in Security Incident Response Agents | paper | 2026-04-13 | Bonan Zheng, Cristian Leo, Daniel Begimher, Jack Huang, Pat Gaw, ai-for-security | raw |
| Like a Hammer, It Can Build, It Can Break: Large Language Model Uses, Perceptions, and Adoption in C | paper | 2026-04-11 | Aditi Ganapathi, Chih-Yi Huang, Gail-Joon Ahn, Jaron Mink, Kashyap Thimmaraju, Souradip Nath | raw |
| LanG -- A Governance-Aware Agentic AI Platform for Unified Security Operations | preprint | 2026-04-07 | agentic-ai, ai-for-security, ai-soc, governance, human-in-the-loop, mcp | raw |
| Security risk management in the digital enterprise: enhancing cyber defense with large language mode | journal_paper | 2026-04-06 | Abdulrahman Alojail, Samir A. E. Kahouf, Shaymaa Sorour, Shorouk El-Deep, ai-for-security, ai-soc | raw |
| Model Context Protocol Threat Modeling and Analyzing Vulnerabilities to Prompt Injection with Tool P | paper | 2026-03-23 | Amin Milani Fard, Charoes Huang, Ngoc Phu Tran, Xin Huang, mcp-security, security-for-ai | raw |
| Memory poisoning and secure multi-agent systems | paper | 2026-03-20 | episodic-memory, mas-misevolution-propagation, memory-poisoning, multi-agent, secure-mas, semantic-memory | raw |
| Retrieval-Augmented LLMs for Security Incident Analysis | paper | 2026-03-18 | Aditya Vikram Singh, Alex Fitts, Alina Oprea, Dirk Van Bruggen, Edward Koh, Harsh Mamania | raw |
| SAGE: Multi-Agent Self-Evolution for LLM Reasoning | paper | 2026-03-16 | critic, curriculum-pool, multi-agent, self-evolving-agent, verifier | raw |
| How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Compe | paper | 2026-03-16 | benchmark, computer-use, concealment, indirect-prompt-injection, red-teaming, security-for-ai | raw |
| Before You Hand Over the Wheel: Evaluating LLMs for Security Incident Analysis | paper | 2026-03-06 | Adrian Taylor, Grant Vandenberghe, Madeena Sultana, Sourov Jajodia, Suryadipta Majumdar, agentic-evaluation | raw |
| From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration | paper | 2026-03-04 | error-cascade, genealogy-graph, llm-mas, mas-misevolution-propagation, multi-agent, propagation | raw |
| ZeroDayBench: Evaluating LLM Agents on Unseen Zero-Day Vulnerabilities | paper | 2026-03 | ai-for-security, benchmark, llm-agents, vulnerability-patching, zero-day | raw |
| The Attack and Defense Landscape of Agentic AI | paper | 2026-03 | agent-security, attack-landscape, defense-landscape, open-challenges, security-for-ai | raw |
| Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios | paper | 2026-03 | ai-for-security, benchmark, cyber-range, llm-agents, multi-step-attack, risk-evaluation | raw |
| From Secure Agentic AI to Secure Agentic Web: Challenges, Threats, and Future Directions | paper | 2026-03 | agent-security, agentic-web, open-challenges, security-for-ai | raw |
| PAIEL: Protocol-Aware and Context-Integrated Protocol Explanation Using LLMs for SOCs | conference_paper | 2026-02-23 | ai-for-security, ai-soc, context-compression, protocol-analysis, rag, structured-context | raw |
| Non-Disruptive Disruption: An Empirical Experience of Introducing LLMs in the SOC | conference_paper | 2026-02-23 | ai-for-security, ai-soc, co-creation, ethnography, human-ai-collaboration | raw |
| Cognitive Threat Detection for SOC Operations: Automating Manipulation Tactic Analysis in Election S | conference_paper | 2026-02-23 | ai-for-security, ai-soc, cognitive-threat, election-security, llm-routing | raw |
| SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Memory P | paper | 2026-02-17 | architectural-isolation, bayesian-trust, mas-misevolution-propagation, mcp, memory-poisoning, multi-agent-memory | raw |
| Memory Poisoning Propagation and Repair Mechanism in Multi-Agent Collaborative Environments | paper | 2026-02-14 | contrastive-learning, evidence-graph, mas-misevolution-propagation, memory-poisoning, multi-agent, propagation | raw |
| AgentDyn: Are Your Agent Security Defenses Deployable in Real-World Dynamic Environments? | paper | 2026-02-03 | agent-security, benchmark, dynamic-tasks, prompt-injection, security-for-ai | raw |
| SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks | paper | 2026-02 | agent-skills, benchmark, self-evolving-agents, self-generated-skills, skillsbench | raw |
| Security Threat Modeling for Emerging AI-Agent Protocols | paper | 2026-02 | agent-protocols, mcp-security, multi-agent, security-for-ai, threat-modeling | raw |
| DARPA's AI Cyber Challenge (AIxCC): Competition Design, Results, and Lessons | paper | 2026-02 | ai-for-security, aixcc, competition, cyber-reasoning-system, patching, vulnerability-discovery | raw |
| Memory Poisoning Attack and Defense on Memory Based LLM-Agents | paper | 2026-01-09 | memory-poisoning, memory-sanitization, query-only-attack, temporal-decay, trust-aware-retrieval | raw |
| Prompt Injection Attacks on Agentic Coding Assistants | paper | 2026-01 | agent-security, coding-agents, prompt-injection, security-for-ai, software-supply-chain | raw |
| Experiences of Using Agentic AI to Fill Tooling Gaps in a Security Operations Center | paper | 2026 | Faayed Al Faisal, Kritan Banstola, Xinming Ou, ai-agent, ai-for-security, alert-triage | raw |
| MemoryGraft: Persistent Compromise of LLM Agents via Poisoned Experience Retrieval | paper | 2025-12-18 | experience-retrieval, memory-poisoning, persistent-compromise, rollout-buffer-security | raw |
| AgentEvolver: Towards Efficient Self-Evolving Agent System | paper | 2025-11-13 | experience-pool, reinforcement-learning, rollout-buffer, self-evolving-agent, trajectory-attribution | raw |
| Whisper Leak: a side-channel attack on Large Language Models | paper | 2025-11-05 | Geoff McDonald, Jonathan Bar Or, llm-traffic-analysis, model-security, privacy, security-for-ai | raw |
| Securing AI Agent Execution | paper | 2025-10-24 | Christoph Bühler, Guido Salvaneschi, Luca Di Grazia, Matteo Biagiola, access-control, agent-security | raw |
| Carbon Filter: Scalable, Efficient, and Secure Alert Triage for Endpoint Detection & Response | conference_paper | 2025-10-20 | Adam Bates, Jonathan Oliver, Muhammad Adil Inam, Raghav Batta, ai-for-security, ai-soc | raw |
| Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples | paper | 2025-10 | backdoors, data-poisoning, model-security, security-for-ai, training-data-security | raw |
| CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage | paper | 2025-09-30 | Bowen Wei, Chris Jordan, Howard Liu, Jinhao Pan, Kun Luo, Yuan Shen Tay | raw |
| Large Language Models for Security Operations Centers: A Comprehensive Survey | paper | 2025-09-13 | Ali Habibzadeh, Farid Feyzi, Reza Ebrahimi Atani, ai-for-security, ai-soc, llm | raw |
| MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers | paper | 2025-08-19 | Guanquan Shi, Haifeng Sun, Haohua Du, Haoran Cheng, Suyuan Liu, Xiangyang Li | raw |
| Systematic Analysis of MCP Security | paper | 2025-08-18 | Peng Di, Puzhuo Liu, Sheng Wen, Wanlun Ma, Xi Xiao, Xiaogang Zhu | raw |
| Integrating Large Language Models into Security Incident Response | conference_paper | 2025-08 | Ajay Narotam, Allison Woodruff, Diana Kramer, Elie Bursztein, Kurt Thomas, Lambert Rosique | raw |
| A Survey on Model Extraction Attacks and Defenses for Large Language Models | paper | 2025-06-26 | Kaixiang Zhao, Kaize Ding, Lincan Li, Neil Zhenqiang Gong, Yue Zhao, Yushun Dong | raw |
| Design Patterns for Securing LLM Agents against Prompt Injections | paper | 2025-06-10 | agents, ai-security, defenses, design-patterns, prompt-injection | raw |
| CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale | paper | 2025-06-03 | Dawn Song, Jialin Zhang, Jingxuan He, Matthew Cai, Tianneng Shi, Zhun Wang | raw |
| SEC-bench: Automated Benchmarking of LLM Agents on Real-World Security Tasks | paper | 2025-06 | ai-for-security, benchmark, llm-agents, security-tasks, vulnerability-reproduction | raw |
| LLMs in the SOC: An Empirical Study of Human-AI Collaboration in Security Operations Centres | paper | 2025-06 | Cecile Paris, Fatemeh Jalalvand, Martin Lochner, Mohan Baruwal Chhetri, Ronal Singh, Shahroz Tariq | raw |
| Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents | paper | 2025-05-29 | ai-technology, coding-agents, open-ended-evolution, recursive-self-improvement, self-improving-agents | raw |
| IRCopilot: Automated Incident Response with Large Language Models | paper | 2025-05-27 | Gelei Deng, Jie Zhang, Qing Guo, Riqing Chen, Tianwei Zhang, Tianzhe Liu | raw |
| Collaborative Memory: Multi-User Memory Sharing in LLM Agents with Dynamic Access Control | paper | 2025-05-23 | access-control, auditability, collaborative-memory, mas-misevolution-propagation, multi-agent, provenance | raw |
| On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents | conference_paper | 2025-05-01 | autoinject, autotransform, challenger, faulty-agents, inspector, mas-misevolution-propagation | raw |
| Towards Secure Systems of Interacting AI Agents | paper | 2025-05 | agent-security, interaction-security, multi-agent, security-for-ai | raw |
| WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks | paper | 2025-04-22 | Aaron Grattafiori, Arman Zharmagambetov, Chuan Guo, Ivan Evtimov, Kamalika Chaudhuri, benchmark | raw |
| Alert Fatigue in Security Operations Centres: Research Challenges and Opportunities | journal_paper | 2025-04-04 | Cecile Paris, Mohan Baruwal Chhetri, Shahroz Tariq, Surya Nepal, ai-for-security, ai-soc | raw |
| Severity-based triage of cybersecurity incidents using kill chain attack graphs | journal_paper | 2025-03 | Basel Katt, Lukas Sadlek, Muhammad Mudassar Yamin, Pavel Celeda, ai-for-security, ai-soc | raw |
| CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities | paper | 2025-03 | ai-for-security, benchmark, cve-bench, exploit-evaluation, llm-agents, web-security | raw |
| AECR: Automatic attack technique intelligence extraction based on fine-tuned large language model | journal_paper | 2025-03 | Bin Lu, Ding Li, Kaijie Zhu, Minghao Chen, Qingjun Yuan, Yuefei Zhu | raw |
| GraphRAG under Fire | paper | 2025-01 | graphrag, poisoning, rag-security, retrieval, security-for-ai | raw |
| Transitioning from MLOps to LLMOps: Navigating the Unique Challenges of Large Language Models | paper | 2025 | ai-operations, ai-security, glossary-gap, large-language-model, llmops, mlops | raw |
| Model Retraining upon Concept Drift Detection in Network Traffic Data Streams | paper | 2025 | anomaly-detection, concept-drift, glossary-gap, mlops, model-drift, network-security | raw |
| EchoLeak: The First Real-World Zero-Click Prompt Injection Exploit | paper | 2025 | data-exfiltration, enterprise-ai, incident-analysis, prompt-injection, security-for-ai | raw |
| CVE-Bench: Benchmarking LLM-based Software Engineering Agents' Ability to Fix Real-world Vulnerabili | paper | 2025 | ai-for-security, benchmark, cve-bench, software-engineering-agents, vulnerability-repair | raw |
| Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges | paper | 2025 | agent-security, defenses, evaluation, security-for-ai, survey | raw |
| AI-Augmented SOC: A Survey of LLMs and Agents for Security Operations | paper | 2025 | ai-for-security, alert-triage, incident-response, llm-agents, security-operations, soc | raw |
| AgentOps: Enabling Observability of LLM Agents | paper | 2024-11-08 | Liming Dong, Liming Zhu, Qinghua Lu, agentops, ai-agent-observability, ai-safety | raw |
| Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM Agents | paper | 2024-10-03 | agents, ai-security, benchmark, defenses, memory-poisoning, prompt-injection | raw |
| Large Language Models Can Provide Accurate and Interpretable Incident Triage | conference_paper | 2024-10 | Changhua Pei, Chaoyun Zhang, Chetan Bansal, Dongmei Zhang, Gaogang Xie, Jianhui Li | raw |
| Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models | paper | 2024-08-15 | ai-for-security, benchmark, capability-evaluation, ctf, cyber-range, llm-agents | raw |
| The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | paper | 2024-08-12 | ai-scientist, ai-technology, automated-scientific-discovery, open-ended-research, self-evolving-ai | raw |
| True Attacks, Attack Attempts, or Benign Triggers? An Empirical Measurement of Network Alerts in a S | conference_paper | 2024-08 | ai-for-security, ai-soc, alert-triage, empirical-measurement, ground-truth | raw |
| BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models | paper | 2024-08 | backdoors, benchmark, llm-security, model-security, security-for-ai | raw |
| AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases | paper | 2024-07-17 | agent-memory, backdoor, knowledge-base-poisoning, red-teaming, retrieval-trigger | raw |
| AI-Driven Guided Response for Security Operation Centers with Microsoft Copilot for Security | paper | 2024-07-12 | Amir Gharib, Jovan Kalajdjieski, Robert McCann, Scott Freitas, ai-for-security, ai-soc | raw |
| AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways | paper | 2024-06-04 | Changzhou Han, Junwu Xiong, Sheng Wen, Wanlun Ma, Yang Xiang, Yongjian Guo | raw |
| Security of AI Agents | paper | 2024-06 | agent-security, architecture, defenses, security-for-ai | raw |
| Generative AI in Cybersecurity: A Comprehensive Review of LLM Applications and Vulnerabilities | paper | 2024-05-21 | ai-for-security, datasets, genai, incident-response, survey, threat-detection | raw |
| Large Language Models for Cyber Security: A Systematic Literature Review | paper | 2024-05-08 | ai-for-security, llm4security, malware-analysis, survey, threat-intelligence, vulnerability-detection | raw |
| ExpeL: LLM Agents Are Experiential Learners | paper | 2024-03-24 | experience-pool, experiential-learning, faiss, retrieval, successful-trajectories | raw |
| InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | paper | 2024-03-05 | agents, ai-security, benchmarks, prompt-injection, tool-use | raw |
| PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models | paper | 2024-02-12 | Binghui Wang, Jinyuan Jia, Runpeng Geng, Wei Zou, data-poisoning, knowledge-poisoning | raw |
| Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training | paper | 2024-01-10 | Anthropic coauthors, Carson Denison, Evan Hubinger, Jesse Mu, Mike Lambert, backdoors | raw |
| CyberSecEval 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language M | paper | 2024 | ai-for-security, benchmark, cyberseceval, llm-evaluation, offensive-security, security-for-ai | raw |
| Benchmarking Prompt-Injection Attacks on Tool-Integrated LLM Agents | paper | 2024 | ai-security, data-exfiltration, privacy, prompt-injection, tool-integrated-agents | raw |
| Voyager: An Open-Ended Embodied Agent with Large Language Models | paper | 2023-05-25 | automatic-curriculum, embodied-agent, executable-code, lifelong-learning, skill-library | raw |
| Reflexion: Language Agents with Verbal Reinforcement Learning | paper | 2023-03-20 | episodic-memory-buffer, reflection, trajectory, verbal-reinforcement-learning | raw |
| Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Pro | paper | 2023-02-23 | Christoph Endres, Kai Greshake, Mario Fritz, Sahar Abdelnabi, Shailesh Mishra, Thorsten Holz | raw |
| That Escalated Quickly: An ML Framework for Alert Prioritization | preprint | 2023-02-13 | ai-for-security, ai-soc, alert-prioritization, machine-learning, managed-security | raw |
| Context2Vector: Accelerating security event triage via context representation learning | journal_paper | 2022-06 | ai-for-security, ai-soc, alert-triage, human-in-the-loop, representation-learning | raw |
| Improved Detection and Response via Optimized Alerts: Usability Study | journal_paper | 2022-05-31 | ai-for-security, ai-soc, alert-fatigue, machine-learning, usability | raw |
| DEEPCASE: Semi-Supervised Contextual Analysis of Security Events | conference_paper | 2022-05 | ai-for-security, ai-soc, deep-learning, event-correlation, semi-supervised-learning | raw |
| An Assessment of the Usability of Machine Learning Based Tools for the Security Operations Center | preprint | 2020-12-16 | ai-for-security, ai-soc, human-ai-collaboration, machine-learning, usability | raw |
| A user-centric machine learning framework for cyber security operations center | conference_paper | 2017-07 | ai-for-security, ai-soc, alert-triage, machine-learning, user-centric | raw |
| Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents | conference_paper | | Shao, Shuai, agent-security, memory, misevolution, self-evolving-agents, tools | raw |
| SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy | preprint | | Dehghantanha, Ali, Homayoun, Sajad, agentic-ai, attack-surface, autonomy, multi-agent-security | raw |
| MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents | paper | | | raw |
| MAS Misevolution Propagation Collection 2026-06-26 | collection_index | 2026-06-26 | collection, error-cascade, mas-misevolution-propagation, memory-poisoning, multi-agent, self-evolving-agents | raw |
| Explainable AI in Cybersecurity Operations: Lessons Learned from User Studies | paper | 2026-06-16 | analyst-decision-support, cybersecurity-operations, explainable-ai, glossary-gap, soc, xai | raw |
| Bounded Autonomy in the SOC: Mitigating Hallucinations in Agentic Incident Response via Neurosymboli | unknown | | | raw |
| BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents | paper | | | raw |
| AgenticCyOps: Securing Multi-Agentic AI Integration in Enterprise Cyber Operations | unknown | | | raw |
| AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents | paper | | | raw |
| Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents | paper | | | raw |
| AI Security Paper Collection 2026-06-29 | collection | 2026-06-29 | ai-security, collection, papers, weekly-ingest | raw |
| AI SOC Q1 Journal and Peer-Reviewed Conference Collection | collection_manifest | 2026-06-30 | ai-for-security, ai-soc, collection-manifest, peer-reviewed, q1-journal | raw |
| A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Sup | survey | | Gao, agent-memory, agent-tools, self-evolving-agents, survey, taxonomy | raw |