AI Security Research Portal

Raw Papers Batch Ingest

This batch source note catalogs the 185 raw paper sources (Markdown files) currently present under vault/01-Raw-Sources/papers/. The raw files remain immutable; this note is the durable ingest handle used by wiki pages, claims, and research questions.

Batch Analysis

Paper Lookup Verification

The batch ingest used existing raw capture metadata to identify titles, source kinds, publication dates, tags, URLs, and summaries. Canonical DOI/arXiv/OpenAlex/Semantic Scholar verification should be refreshed with paper-lookup for any individual paper that is promoted into a stable claim, report, or formal literature review.

Citation Record

This note is a batch-level citation handle for the current raw paper corpus. Individual paper citations remain in the raw capture files and should be normalized with citation-management before publication-grade reuse.

Evidence Quality

The paper corpus is strong for topic discovery and benchmark mapping, but mixed for claim stabilization. Benchmark papers, surveys, and peer-reviewed work should be separated from collection manifests, preliminary preprints, and unverified capture notes before assigning high confidence to claims.
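The separation described above can be sketched as a small triage rule. This is only an illustration: the tier names ("hold", "candidate", "needs-review"), the `CatalogEntry` type, and the `evidence_tier` function are hypothetical, and the rule assumes that the Kind and Tags columns of the Source Catalog are the only signals available. A generic "paper" kind is treated as unresolved because the catalog does not say whether such an entry is peer-reviewed.

```python
from dataclasses import dataclass

# Kind labels below are the ones that appear in the Source Catalog.
# The grouping into tiers is an assumption made for illustration.
PEER_REVIEWED_KINDS = {"journal_paper", "conference_paper"}
HOLD_KINDS = {"collection_manifest", "collection_index", "collection",
              "preprint", "unknown"}
STRONG_TAGS = {"benchmark", "survey"}


@dataclass(frozen=True)
class CatalogEntry:
    """One catalog row, reduced to the fields the rule needs (hypothetical type)."""
    title: str
    kind: str
    tags: frozenset


def evidence_tier(entry):
    """Return a coarse tier for claim stabilization.

    hold         -> manifests, preprints, unverified captures
    candidate    -> peer-reviewed kinds, surveys, benchmark-tagged papers
    needs-review -> everything else (for example a generic "paper" kind)
    """
    if entry.kind in HOLD_KINDS:
        return "hold"
    if (entry.kind in PEER_REVIEWED_KINDS
            or entry.kind == "survey"
            or entry.tags & STRONG_TAGS):
        return "candidate"
    return "needs-review"


# Rows taken from the Source Catalog in this note.
manifest = CatalogEntry("AI Security Paper Collection 2026-06-25",
                        "collection_manifest",
                        frozenset({"collection-manifest", "papers"}))
deepcase = CatalogEntry("DEEPCASE: Semi-Supervised Contextual Analysis of Security Events",
                        "conference_paper",
                        frozenset({"ai-for-security", "ai-soc"}))
realm = CatalogEntry("REALM: A Unified Red-Teaming Benchmark for Physical-World VLMs",
                     "paper",
                     frozenset({"ai-security", "benchmark", "red-teaming"}))
rag_factors = CatalogEntry("Influence Factors on RAG Poisoning",
                           "paper",
                           frozenset({"ai-security", "evaluation", "rag"}))
```

Even a "candidate" result only marks an entry as worth verifying first; it does not replace the source-specific checks this note calls for.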

Scholar Evaluation

At batch level, the corpus shows broad coverage of agent security, prompt injection, memory poisoning, RAG security, MCP/protocol security, AI SOC, and benchmark design. Source-specific scholar evaluation remains required before treating a paper as decisive evidence.
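One rough way to check batch-level topic coverage is to count how many catalog entries carry each tag. The sketch below is illustrative only: `tag_coverage` and `SAMPLE` are hypothetical names, and the five rows are copied from the Source Catalog purely as sample input. Note that the Tags column in the catalog sometimes holds author names, so a real count would need to filter those out first.

```python
from collections import Counter

# (title, tags) pairs copied from five rows of the Source Catalog.
SAMPLE = [
    ("Influence Factors on RAG Poisoning",
     ["ai-security", "evaluation", "poisoning", "rag", "retrieval"]),
    ("Same-Origin Policy for Agentic Browsers",
     ["agentic-browser", "ai-security", "prompt-injection",
      "same-origin-policy", "web-security"]),
    ("MemLineage: Lineage-Guided Enforcement for LLM Agent Memory",
     ["derivation-dag", "memory-lineage", "merkle-log", "provenance",
      "sensitive-action-gate"]),
    ("Defending LLM Agents Against Context-Aware Prompt Injection",
     ["agents", "ai-security", "context-aware-attacks", "defenses",
      "prompt-injection"]),
    ("GraphRAG under Fire",
     ["graphrag", "poisoning", "rag-security", "retrieval",
      "security-for-ai"]),
]


def tag_coverage(entries):
    """Count the number of entries that carry each tag (one vote per entry)."""
    counts = Counter()
    for _title, tags in entries:
        counts.update(set(tags))  # set() guards against a tag repeated in one row
    return counts


coverage = tag_coverage(SAMPLE)
print(coverage.most_common(3))
```

Tag counts indicate breadth of coverage only; they say nothing about the quality of the individual sources.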

Key Claims

Contradictions Or Caveats

Follow-Up Hypotheses

Candidate Research Questions

Source Catalog

Title | Kind | Date | Tags | Raw
Prompt Injection in Automated Résumé Screening with Large Language Models: Single and Multi-Injectio | paper | 2026-06-25 | Jane Yi Jiang, Jiannan Xu, Preet Baxi, Stefanus Jasin, decision-integrity, hiring-workflow | raw
AI Security Paper Collection 2026-06-25 | collection_manifest | 2026-06-25 | agent-security, ai-for-security, ai-security, collection-manifest, jailbreak, papers | raw
What Intermediate Layers Know: Detecting Jailbreaks from Entropy Dynamics | paper | 2026-06-24 | ai-security, detection, guardrails, jailbreak, mechanistic-interpretability | raw
Tracing Target Answers in Poisoned Retrieval Corpora via Token Influence Attribution | paper | 2026-06-24 | ai-security, attribution, provenance, rag, retrieval-poisoning | raw
Security and Privacy in Retrieval-Augmented Generation: Architectures, Threats, Defenses, and Future | paper | 2026-06-24 | ai-security, defense, privacy, rag, survey, threat-model | raw
How Reliable Is Your Jailbreak Judge? Calibration and Adversarial Robustness of Automated ASR Scorin | paper | 2026-06-24 | ai-security, asr, calibration, evaluation, jailbreak, llm-judge | raw
AI Snitches Get Glitches: Towards Evading Agentic Surveillance | paper | 2026-06-24 | adversarial, agentic-ai, ai-security, monitoring, surveillance-evasion | raw
Securing LLM-Agent Long-Term Memory Against Poisoning: Non-Malleable, Origin-Bound Authority with Ma | paper | 2026-06-23 | agent-memory, ai-security, formal-methods, memory-poisoning, provenance | raw
REALM: A Unified Red-Teaming Benchmark for Physical-World VLMs | paper | 2026-06-23 | ai-security, benchmark, multimodal-security, red-teaming, vlm | raw
Privacy-Preserving RAG via Multi-Agent Semantic Rewriting: Achieving Confidentiality Without Comprom | paper | 2026-06-23 | ai-security, multi-agent-systems, privacy, rag, semantic-rewriting | raw
Poisoned Playbooks: Demystifying Knowledge Poisoning Effects on AI Security Agents | paper | 2026-06-23 | ai-for-security, ai-security, knowledge-poisoning, playbooks, soc-agent | raw
PixJail: Self-Evolving Paper-to-Pipeline Reproduction for Text-to-Image Jailbreak Evaluation | paper | 2026-06-23 | ai-security, evaluation, jailbreak, self-evolving, text-to-image | raw
When AUC 0.998 Is Not Enough: A Candidate Evaluation Protocol for Hidden-State Probes of Indirect Pr | paper | 2026-06-22 | ai-security, computer-use-agent, evaluation, hidden-state-probes, indirect-prompt-injection, multimodal-agent | raw
Self-Evolving Agent Rollout and Experience Buffer Collection | collection_manifest | 2026-06-22 | attack-surface, collection, experience-memory, rollout-buffer, self-evolving-agent | raw
RAVEN: Agentic RAG for Automated Vulnerability Repair | paper | 2026-06-22 | agentic-rag, ai-for-security, software-security, vulnerability-repair | raw
GIF: Locally Sound Geometric Information Flow Control for LLMs | paper | 2026-06-22 | ai-security, data-leakage, formal-methods, information-flow-control, llm | raw
Detecting Malicious Agent Skills in the Wild using Attention | paper | 2026-06-22 | agent-skills, ai-security, detection, malicious-skills, supply-chain | raw
Confidently Wrong: Severity-Aware Calibration of Prompt-Injection Detectors under Attack Shift | paper | 2026-06-22 | ai-security, detector-calibration, evaluation, prompt-injection, robustness | raw
AgentLens: Interpretable Safety Steering via Mechanistic Subspaces for Multi-Turn Coding Agent | paper | 2026-06-22 | ai-security, coding-agent, interpretability, multi-turn-agent, safety-steering | raw
Safe to Check, Unsafe to Use: Relinking at the Compression Boundary of LLM Agents | paper | 2026-06-20 | agent-memory, ai-security, context-compression, llm-agent, prompt-injection | raw
AgentRiskBOM: A Risk-Scoping Security Bill of Materials for Agentic AI Systems | paper | 2026-06-20 | agentic-ai, ai-security, governance, risk-management, sbom | raw
Agent-Assisted Side-Channel Attacks on Non-Prefix KV Cache in RAG | paper | 2026-06-20 | agent-assisted-attack, ai-security, kv-cache, rag, side-channel | raw
"What Happens Locally, Leaks Globally": Detecting Privacy Leakage Risks in MCP Servers | paper | 2026-06-19 | agent-security, ai-security, mcp, privacy-leakage, static-analysis, tool-security | raw
Scalable Hierarchical Attention Transformers for Multi-Turn Jailbreak Detection in Long Conversation | paper | 2026-06-19 | ai-security, detection, jailbreak, long-context, multi-turn | raw
OTTER: A Red-Teaming System for Toxicity-Evading Jailbreak Prompt Optimization | paper | 2026-06-19 | ai-security, jailbreak, prompt-optimization, red-teaming, safety-evaluation | raw
Honeyquest for LLMs: Rethinking Cyber Deception for AI Attackers | paper | 2026-06-19 | ai-for-security, cyber-deception, honeypot, llm-attackers, threat-intelligence | raw
AgenticOS: An Intent-Oriented Secure Operating System Architecture for Autonomous AI Agents | paper | 2026-06-19 | agentic-ai, ai-security, intent, runtime-security, secure-os | raw
Efficient and Sound Probabilistic Verification for AI Agents | paper | 2026-06-18 | Alaia Solko-Breslin, Krishnamurthy Dvijotham, Mihai Christodorescu, Pramod Kaushik Mudrakarta, Somesh Jha, datalog | raw
A Layered Security Framework Against Prompt Injection in RAG-Based Chatbots | paper | 2026-06-18 | ai-security, chatbot-security, defense, prompt-injection, rag | raw
Ghost Vectors: Soft-Deleted Embeddings Remain Reconstructible in HNSW Vector Databases | paper | 2026-06-17 | ai-security, data-deletion, embeddings, privacy, rag, vector-database | raw
From Privacy to Workflow Integrity: Communication-Graph Metadata in Autonomous Agent Interoperabilit | paper | 2026-06-17 | a2a, ai-security, mcp, metadata-leakage, multi-agent-systems, workflow-integrity | raw
Code-Augur: Agentic Vulnerability Detection via Specification Inference | paper | 2026-06-17 | agentic-ai, ai-for-security, software-security, specification-inference, vulnerability-detection | raw
SafeClawBench: Separating Semantic, Audit-Evidence, and Sandbox Harm in Tool-Using LLM Agents | paper | 2026-06-16 | Chao Xu, Hanting Chen, Haocheng Mei, Mengyu Zheng, Xinghao Chen, Ye Yuan | raw
Conflict-Aware Retriever Editing for Knowledge Injection Attacks on LLM-Based RAG Systems | paper | 2026-06-16 | ai-security, knowledge-injection, poisoning, rag, retriever-editing | raw
An Evaluation of Data Leakage Risks in Tool-Using LLM Agents in Realistic Scenarios | paper | 2026-06-15 | agent-security, ai-security, data-leakage, privacy, tool-using-agent | raw
TrustedARI: Towards Trust-Native Agentic Routing Infrastructure for Agentic AI | paper | 2026-06-14 | agentic-ai, ai-security, multi-agent-systems, routing, trust-infrastructure | raw
One Goal, Many Commands: Characterizing Denylist Fragility in AI Agents | paper | 2026-06-14 | agent-security, ai-security, denylist, policy-enforcement, tool-use | raw
Let Them Steal: Trapping Large Language Model Extraction Attacks with Knowledge Honeypot | paper | 2026-06-14 | ai-security, defense, honeypot, llm, model-extraction | raw
AIChilles: Automatically Uncovering Hidden Weaknesses in AI-Evolved Systems | paper | 2026-06-14 | ai-evolved-systems, ai-security, self-evolving-ai, testing, weakness-discovery | raw
Same-Origin Policy for Agentic Browsers | paper | 2026-06-12 | agentic-browser, ai-security, prompt-injection, same-origin-policy, web-security | raw
Game-Theoretic Multi-Agent Control for Robust Contextual Reasoning in LLMs | paper | 2026-06-12 | ai-security, context-poisoning, mcp, multi-agent-control, prompt-injection, rollback | raw
AgentCyberRange: Benchmarking Frontier AI Systems in Realistic Cyber Ranges | paper | 2026-06-12 | ai-for-security, benchmark, cyber-capability, cyber-range, frontier-ai | raw
SMSR: Certified Defence Against Runtime Memory Poisoning in Persistent LLM Agent Systems | paper | 2026-06-11 | agent-memory, ai-security, certified-defense, memory-poisoning, persistent-agents | raw
PI-Hunter: Automated Red-Teaming for Exposing and Localizing Prompt Injections | paper | 2026-06-10 | Ash Fox, George Lee, Jiliang Tang, Lesly Miculicich, Long T. Le, Pengfei He | raw
Influence Factors on RAG Poisoning | paper | 2026-06-10 | ai-security, evaluation, poisoning, rag, retrieval | raw
Agents All the Way Down; A Methodology for Building Custom AI Agents from Substrate to Production | paper | 2026-06-10 | agent-methodology, agentic-ai, ai-technology, audit-trail, custom-agents, security-boundaries | raw
When Poison Fails After Retrieval: Revisiting Corpus Poisoning under Chunking and Reranking Pipeline | paper | 2026-06-09 | ai-security, chunking, corpus-poisoning, rag, reranking, retrieval | raw
Role-Agent: Bootstrapping LLM Agents via Dual-Role Evolution | paper | 2026-06-09 | co-evolution, curriculum, failure-history, rollout-trajectories, self-evolving-agent | raw
Assessing Automated Prompt Injection Attacks in Agentic Environments | paper | 2026-06-09 | David Hofer, Edoardo Debenedetti, Florian Tramèr, agent-security, agentdojo, benchmark | raw
AgentCanary: A Security Evaluation Framework for Autonomous AI Agents in Real Executable Environment | paper | 2026-06-09 | agent-security, ai-security, benchmark, evaluation, executable-environment | raw
The Injection Paradox: Brand-Level Suppression in Safety-Trained LLM Recommendations via RAG Context | paper | 2026-06-08 | ai-security, context-injection, prompt-injection, rag, recommendation | raw
Document-Authored Control-Signal Impersonation: A Low-Cost Indirect Prompt Attack on RAG Safety Boun | paper | 2026-06-08 | ai-security, control-signal, indirect-prompt-injection, rag, safety-boundary | raw
RAILS: Verification-Native Clearing For Agentic Commerce | paper | 2026-06-07 | agent-integrity, agentic-commerce, ai-security, non-human-identity, settlement-risk, verification | raw
GitInject: Real-World Prompt Injection Attacks in AI-Powered CI/CD Pipelines | paper | 2026-06-07 | Ilia Shumailov, Jafar Isbarov, Murat Kantarcioglu, Umid Suleymanov, benchmark, ci-cd | raw
Data Agents Under Attack: Vulnerabilities in LLM-Driven Analytical Systems | paper | 2026-06-07 | Gao Cong, Guoliang Li, Haoyang Li, Kuncan Wang, Peizhuo Lv, Wei Dong | raw
OpenAgenet / OAN Yellow Paper: Technical Architecture for Trust-Governed Resource Identity and Disco | paper | 2026-06-05 | a2a, agent-identity, ai-security, mcp, resource-discovery, skills | raw
ZERO-APT: A Closed-Loop Adversarial Framework for LLM-Driven Automated Penetration Testing under Int | paper | 2026-06-04 | Anlan Zheng, Tiantian Zhu, ai-for-security, auditability, cyber-benchmark, defender-in-the-loop | raw
What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Syst | paper | 2026-06-03 | Liya Su, Suchen Liu, Tianyun Liu, Tingwen Liu, Yingjie Zhang, Yuanbo Xie | raw
NLLog: Lightweight, Explainable SOC Anomaly Detection via Log-to-Language Rewriting | preprint | 2026-06-03 | ai-for-security, ai-soc, anomaly-detection, explainability, log-analysis | raw
From Untrusted Input to Trusted Memory: A Systematic Study of Memory Poisoning Attacks in LLM Agents | paper | 2026-06-03 | Aditi Jain, Pritam Dash, Tanmay Shah, Tongyu Ge, Zhiwei Shang, agent-memory | raw
CyberGym-E2E: Scalable Real-World Benchmark for AI Agents' End-to-End Cybersecurity Capabilities | paper | 2026-06-03 | Alexander Cheung, Chenguang Wang, Dawn Song, Dongwei Jiang, Francisco De La Riega, Gabriel Han | raw
AI Model Extraction Attacks: Bypassing Single-Client Assumptions in Defenses | paper | 2026-06-02 | Gustavo Sánchez, Johannes F. Loevenich, Laurin Holz, Maxime Schwarzer, Roberto Rigolin F. Lopes, Thies Möhlenhof | raw
An Embarrassingly Simple Detector for Model Extraction Attacks in LLM APIs | paper | 2026-06 | detection, latest-research, llm-api-security, model-extraction, security-for-ai | raw
SS-ZKR: Spatial-Semantic Zero-Knowledge Routing for Privacy-Preserving Multi-Agent Collaboration | paper | 2026-05-31 | a2a, ai-security, mcp, multi-agent-systems, privacy, routing | raw
Benchmarking Security Risk Detection and Verification in Open Agentic Skill Ecosystems | paper | 2026-05-30 | Ismail Hossain, Nan Jiang, Sai Puppala, Sajedul Talukder, Zhuoran Lu, agent-skills | raw
Poison with Style: A Practical Poisoning Attack on Code Large Language Models | paper | 2026-05-26 | Issa Khalil, Khang Tran, Md Rizwan Parvez, NhatHai Phan, Ting Yu, Yazan Boshmaf | raw
MemMorph: Tool Hijacking in LLM Agents via Memory Poisoning | paper | 2026-05-24 | accumulated-experience, memory-poisoning, persistent-state, tool-hijacking, tool-selection | raw
SkillOpt: Executive Strategy for Self-Evolving Agent Skills | paper | 2026-05-22 | agent-skills, microsoft, self-evolving-agents, skillopt, text-space-optimization | raw
MemAudit: Post-hoc Auditing of Poisoned Agent Memory via Causal Attribution and Structural Anomaly D | paper | 2026-05-22 | anomaly-detection, causal-attribution, counterfactual-replay, memory-audit, poisoning | raw
From Raw Experience to Skill Consumption: A Systematic Study of Model-Generated Agent Skills | paper | 2026-05-22 | agent-skills, microsoft, model-generated-skills, self-evolving-agents, skilllens | raw
Self-Evolving Multi-Agent Systems via Decentralized Memory | paper | 2026-05-21 | decentralized-memory, llm-as-a-judge, mas-misevolution-propagation, multi-agent, persistent-memory, self-evolving-agents | raw
LivePI: More Realistic Benchmarking of Agents Against Indirect Prompt Injection | paper | 2026-05-18 | benchmark, executable-harm, indirect-prompt-injection, security-for-ai, virtual-machine | raw
AI Agents May Always Fall for Prompt Injections | paper | 2026-05-17 | contextual-integrity, defense-limitations, information-flow, prompt-injection, security-for-ai | raw
State Contamination in Memory-Augmented LLM Agents | paper | 2026-05-16 | mas-misevolution-propagation, memory-laundering, memory-poisoning, multi-agent-rollouts, persistent-state, state-contamination | raw
MemLineage: Lineage-Guided Enforcement for LLM Agent Memory | paper | 2026-05-14 | derivation-dag, memory-lineage, merkle-log, provenance, sensitive-action-gate | raw
DecodingTrust-Agent Platform (DTap): A Controllable and Interactive Red-Teaming Platform for AI Agen | paper | 2026-05-06 | agent-security, benchmark, prompt-injection, red-teaming, security-for-ai, skill-injection | raw
Authorization Propagation in Multi-Agent AI Systems: Identity Governance as Infrastructure | paper | 2026-05-06 | authorization, delegation, identity-governance, multi-agent-systems, security-for-ai | raw
Defending LLM Agents Against Context-Aware Prompt Injection | paper | 2026-05-05 | agents, ai-security, context-aware-attacks, defenses, prompt-injection | raw
You Live More Than Once: Towards Hierarchical Skill Meta-Evolving | paper | 2026-05 | agent-skills, meta-evolving, self-evolving-agents, skill-evolving, test-time-learning | raw
Toward Autonomous SOC Operations: End-to-End LLM Framework for Threat Detection, Query Generation, a | paper | 2026-04-30 | Akramul Azim, Md Hasan Saju, ai-for-security, ai-soc, autonomous-soc, query-generation | raw
OpenSOC-AI: Democratizing Security Operations with Parameter Efficient LLM Log Analysis | preprint | 2026-04-29 | ai-for-security, ai-soc, llm, log-analysis, lora, smbs | raw
AgentSOC: A Multi-Layer Agentic AI Framework for Security Operations Automation | paper | 2026-04-22 | Joyjit Roy, Samaresh Kumar Singh, agentic-soc, ai-for-security, ai-soc, incident-response | raw
MemEvoBench: Benchmarking Safety Risks from Memory Misevolution in LLM Agents | paper | 2026-04-17 | benchmark, biased-feedback, long-horizon-safety, memory-misevolution, noisy-tools | raw
CoEvolve: Training LLM Agents via Agent-Data Mutual Evolution | paper | 2026-04-17 | agent-data-coevolution, forgetting, rollout-trajectories, self-evolving-agent, uncertainty | raw
SIR-Bench: Evaluating Investigation Depth in Security Incident Response Agents | paper | 2026-04-13 | Bonan Zheng, Cristian Leo, Daniel Begimher, Jack Huang, Pat Gaw, ai-for-security | raw
Like a Hammer, It Can Build, It Can Break: Large Language Model Uses, Perceptions, and Adoption in C | paper | 2026-04-11 | Aditi Ganapathi, Chih-Yi Huang, Gail-Joon Ahn, Jaron Mink, Kashyap Thimmaraju, Souradip Nath | raw
LanG -- A Governance-Aware Agentic AI Platform for Unified Security Operations | preprint | 2026-04-07 | agentic-ai, ai-for-security, ai-soc, governance, human-in-the-loop, mcp | raw
Security risk management in the digital enterprise: enhancing cyber defense with large language mode | journal_paper | 2026-04-06 | Abdulrahman Alojail, Samir A. E. Kahouf, Shaymaa Sorour, Shorouk El-Deep, ai-for-security, ai-soc | raw
Model Context Protocol Threat Modeling and Analyzing Vulnerabilities to Prompt Injection with Tool P | paper | 2026-03-23 | Amin Milani Fard, Charoes Huang, Ngoc Phu Tran, Xin Huang, mcp-security, security-for-ai | raw
Memory poisoning and secure multi-agent systems | paper | 2026-03-20 | episodic-memory, mas-misevolution-propagation, memory-poisoning, multi-agent, secure-mas, semantic-memory | raw
Retrieval-Augmented LLMs for Security Incident Analysis | paper | 2026-03-18 | Aditya Vikram Singh, Alex Fitts, Alina Oprea, Dirk Van Bruggen, Edward Koh, Harsh Mamania | raw
SAGE: Multi-Agent Self-Evolution for LLM Reasoning | paper | 2026-03-16 | critic, curriculum-pool, multi-agent, self-evolving-agent, verifier | raw
How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Compe | paper | 2026-03-16 | benchmark, computer-use, concealment, indirect-prompt-injection, red-teaming, security-for-ai | raw
Before You Hand Over the Wheel: Evaluating LLMs for Security Incident Analysis | paper | 2026-03-06 | Adrian Taylor, Grant Vandenberghe, Madeena Sultana, Sourov Jajodia, Suryadipta Majumdar, agentic-evaluation | raw
From Spark to Fire: Modeling and Mitigating Error Cascades in LLM-Based Multi-Agent Collaboration | paper | 2026-03-04 | error-cascade, genealogy-graph, llm-mas, mas-misevolution-propagation, multi-agent, propagation | raw
ZeroDayBench: Evaluating LLM Agents on Unseen Zero-Day Vulnerabilities | paper | 2026-03 | ai-for-security, benchmark, llm-agents, vulnerability-patching, zero-day | raw
The Attack and Defense Landscape of Agentic AI | paper | 2026-03 | agent-security, attack-landscape, defense-landscape, open-challenges, security-for-ai | raw
Measuring AI Agents' Progress on Multi-Step Cyber Attack Scenarios | paper | 2026-03 | ai-for-security, benchmark, cyber-range, llm-agents, multi-step-attack, risk-evaluation | raw
From Secure Agentic AI to Secure Agentic Web: Challenges, Threats, and Future Directions | paper | 2026-03 | agent-security, agentic-web, open-challenges, security-for-ai | raw
PAIEL: Protocol-Aware and Context-Integrated Protocol Explanation Using LLMs for SOCs | conference_paper | 2026-02-23 | ai-for-security, ai-soc, context-compression, protocol-analysis, rag, structured-context | raw
Non-Disruptive Disruption: An Empirical Experience of Introducing LLMs in the SOC | conference_paper | 2026-02-23 | ai-for-security, ai-soc, co-creation, ethnography, human-ai-collaboration | raw
Cognitive Threat Detection for SOC Operations: Automating Manipulation Tactic Analysis in Election S | conference_paper | 2026-02-23 | ai-for-security, ai-soc, cognitive-threat, election-security, llm-routing | raw
SuperLocalMemory: Privacy-Preserving Multi-Agent Memory with Bayesian Trust Defense Against Memory P | paper | 2026-02-17 | architectural-isolation, bayesian-trust, mas-misevolution-propagation, mcp, memory-poisoning, multi-agent-memory | raw
Memory Poisoning Propagation and Repair Mechanism in Multi-Agent Collaborative Environments | paper | 2026-02-14 | contrastive-learning, evidence-graph, mas-misevolution-propagation, memory-poisoning, multi-agent, propagation | raw
AgentDyn: Are Your Agent Security Defenses Deployable in Real-World Dynamic Environments? | paper | 2026-02-03 | agent-security, benchmark, dynamic-tasks, prompt-injection, security-for-ai | raw
SkillsBench: Benchmarking How Well Agent Skills Work Across Diverse Tasks | paper | 2026-02 | agent-skills, benchmark, self-evolving-agents, self-generated-skills, skillsbench | raw
Security Threat Modeling for Emerging AI-Agent Protocols | paper | 2026-02 | agent-protocols, mcp-security, multi-agent, security-for-ai, threat-modeling | raw
DARPA's AI Cyber Challenge (AIxCC): Competition Design, Results, and Lessons | paper | 2026-02 | ai-for-security, aixcc, competition, cyber-reasoning-system, patching, vulnerability-discovery | raw
Memory Poisoning Attack and Defense on Memory Based LLM-Agents | paper | 2026-01-09 | memory-poisoning, memory-sanitization, query-only-attack, temporal-decay, trust-aware-retrieval | raw
Prompt Injection Attacks on Agentic Coding Assistants | paper | 2026-01 | agent-security, coding-agents, prompt-injection, security-for-ai, software-supply-chain | raw
Experiences of Using Agentic AI to Fill Tooling Gaps in a Security Operations Center | paper | 2026 | Faayed Al Faisal, Kritan Banstola, Xinming Ou, ai-agent, ai-for-security, alert-triage | raw
MemoryGraft: Persistent Compromise of LLM Agents via Poisoned Experience Retrieval | paper | 2025-12-18 | experience-retrieval, memory-poisoning, persistent-compromise, rollout-buffer-security | raw
AgentEvolver: Towards Efficient Self-Evolving Agent System | paper | 2025-11-13 | experience-pool, reinforcement-learning, rollout-buffer, self-evolving-agent, trajectory-attribution | raw
Whisper Leak: a side-channel attack on Large Language Models | paper | 2025-11-05 | Geoff McDonald, Jonathan Bar Or, llm-traffic-analysis, model-security, privacy, security-for-ai | raw
Securing AI Agent Execution | paper | 2025-10-24 | Christoph Bühler, Guido Salvaneschi, Luca Di Grazia, Matteo Biagiola, access-control, agent-security | raw
Carbon Filter: Scalable, Efficient, and Secure Alert Triage for Endpoint Detection & Response | conference_paper | 2025-10-20 | Adam Bates, Jonathan Oliver, Muhammad Adil Inam, Raghav Batta, ai-for-security, ai-soc | raw
Poisoning Attacks on LLMs Require a Near-constant Number of Poison Samples | paper | 2025-10 | backdoors, data-poisoning, model-security, security-for-ai, training-data-security | raw
CORTEX: Collaborative LLM Agents for High-Stakes Alert Triage | paper | 2025-09-30 | Bowen Wei, Chris Jordan, Howard Liu, Jinhao Pan, Kun Luo, Yuan Shen Tay | raw
Large Language Models for Security Operations Centers: A Comprehensive Survey | paper | 2025-09-13 | Ali Habibzadeh, Farid Feyzi, Reza Ebrahimi Atani, ai-for-security, ai-soc, llm | raw
MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers | paper | 2025-08-19 | Guanquan Shi, Haifeng Sun, Haohua Du, Haoran Cheng, Suyuan Liu, Xiangyang Li | raw
Systematic Analysis of MCP Security | paper | 2025-08-18 | Peng Di, Puzhuo Liu, Sheng Wen, Wanlun Ma, Xi Xiao, Xiaogang Zhu | raw
Integrating Large Language Models into Security Incident Response | conference_paper | 2025-08 | Ajay Narotam, Allison Woodruff, Diana Kramer, Elie Bursztein, Kurt Thomas, Lambert Rosique | raw
A Survey on Model Extraction Attacks and Defenses for Large Language Models | paper | 2025-06-26 | Kaixiang Zhao, Kaize Ding, Lincan Li, Neil Zhenqiang Gong, Yue Zhao, Yushun Dong | raw
Design Patterns for Securing LLM Agents against Prompt Injections | paper | 2025-06-10 | agents, ai-security, defenses, design-patterns, prompt-injection | raw
CyberGym: Evaluating AI Agents' Cybersecurity Capabilities with Real-World Vulnerabilities at Scale | paper | 2025-06-03 | Dawn Song, Jialin Zhang, Jingxuan He, Matthew Cai, Tianneng Shi, Zhun Wang | raw
SEC-bench: Automated Benchmarking of LLM Agents on Real-World Security Tasks | paper | 2025-06 | ai-for-security, benchmark, llm-agents, security-tasks, vulnerability-reproduction | raw
LLMs in the SOC: An Empirical Study of Human-AI Collaboration in Security Operations Centres | paper | 2025-06 | Cecile Paris, Fatemeh Jalalvand, Martin Lochner, Mohan Baruwal Chhetri, Ronal Singh, Shahroz Tariq | raw
Darwin Godel Machine: Open-Ended Evolution of Self-Improving Agents | paper | 2025-05-29 | ai-technology, coding-agents, open-ended-evolution, recursive-self-improvement, self-improving-agents | raw
IRCopilot: Automated Incident Response with Large Language Models | paper | 2025-05-27 | Gelei Deng, Jie Zhang, Qing Guo, Riqing Chen, Tianwei Zhang, Tianzhe Liu | raw
Collaborative Memory: Multi-User Memory Sharing in LLM Agents with Dynamic Access Control | paper | 2025-05-23 | access-control, auditability, collaborative-memory, mas-misevolution-propagation, multi-agent, provenance | raw
On the Resilience of LLM-Based Multi-Agent Collaboration with Faulty Agents | conference_paper | 2025-05-01 | autoinject, autotransform, challenger, faulty-agents, inspector, mas-misevolution-propagation | raw
Towards Secure Systems of Interacting AI Agents | paper | 2025-05 | agent-security, interaction-security, multi-agent, security-for-ai | raw
WASP: Benchmarking Web Agent Security Against Prompt Injection Attacks | paper | 2025-04-22 | Aaron Grattafiori, Arman Zharmagambetov, Chuan Guo, Ivan Evtimov, Kamalika Chaudhuri, benchmark | raw
Alert Fatigue in Security Operations Centres: Research Challenges and Opportunities | journal_paper | 2025-04-04 | Cecile Paris, Mohan Baruwal Chhetri, Shahroz Tariq, Surya Nepal, ai-for-security, ai-soc | raw
Severity-based triage of cybersecurity incidents using kill chain attack graphs | journal_paper | 2025-03 | Basel Katt, Lukas Sadlek, Muhammad Mudassar Yamin, Pavel Celeda, ai-for-security, ai-soc | raw
CVE-Bench: A Benchmark for AI Agents' Ability to Exploit Real-World Web Application Vulnerabilities | paper | 2025-03 | ai-for-security, benchmark, cve-bench, exploit-evaluation, llm-agents, web-security | raw
AECR: Automatic attack technique intelligence extraction based on fine-tuned large language model | journal_paper | 2025-03 | Bin Lu, Ding Li, Kaijie Zhu, Minghao Chen, Qingjun Yuan, Yuefei Zhu | raw
GraphRAG under Fire | paper | 2025-01 | graphrag, poisoning, rag-security, retrieval, security-for-ai | raw
Transitioning from MLOps to LLMOps: Navigating the Unique Challenges of Large Language Models | paper | 2025 | ai-operations, ai-security, glossary-gap, large-language-model, llmops, mlops | raw
Model Retraining upon Concept Drift Detection in Network Traffic Data Streams | paper | 2025 | anomaly-detection, concept-drift, glossary-gap, mlops, model-drift, network-security | raw
EchoLeak: The First Real-World Zero-Click Prompt Injection Exploit | paper | 2025 | data-exfiltration, enterprise-ai, incident-analysis, prompt-injection, security-for-ai | raw
CVE-Bench: Benchmarking LLM-based Software Engineering Agents' Ability to Fix Real-world Vulnerabili | paper | 2025 | ai-for-security, benchmark, cve-bench, software-engineering-agents, vulnerability-repair | raw
Agentic AI Security: Threats, Defenses, Evaluation, and Open Challenges | paper | 2025 | agent-security, defenses, evaluation, security-for-ai, survey | raw
AI-Augmented SOC: A Survey of LLMs and Agents for Security Operations | paper | 2025 | ai-for-security, alert-triage, incident-response, llm-agents, security-operations, soc | raw
AgentOps: Enabling Observability of LLM Agents | paper | 2024-11-08 | Liming Dong, Liming Zhu, Qinghua Lu, agentops, ai-agent-observability, ai-safety | raw
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM Agents | paper | 2024-10-03 | agents, ai-security, benchmark, defenses, memory-poisoning, prompt-injection | raw
Large Language Models Can Provide Accurate and Interpretable Incident Triage | conference_paper | 2024-10 | Changhua Pei, Chaoyun Zhang, Chetan Bansal, Dongmei Zhang, Gaogang Xie, Jianhui Li | raw
Cybench: A Framework for Evaluating Cybersecurity Capabilities and Risks of Language Models | paper | 2024-08-15 | ai-for-security, benchmark, capability-evaluation, ctf, cyber-range, llm-agents | raw
The AI Scientist: Towards Fully Automated Open-Ended Scientific Discovery | paper | 2024-08-12 | ai-scientist, ai-technology, automated-scientific-discovery, open-ended-research, self-evolving-ai | raw
True Attacks, Attack Attempts, or Benign Triggers? An Empirical Measurement of Network Alerts in a S | conference_paper | 2024-08 | ai-for-security, ai-soc, alert-triage, empirical-measurement, ground-truth | raw
BackdoorLLM: A Comprehensive Benchmark for Backdoor Attacks on Large Language Models | paper | 2024-08 | backdoors, benchmark, llm-security, model-security, security-for-ai | raw
AgentPoison: Red-teaming LLM Agents via Poisoning Memory or Knowledge Bases | paper | 2024-07-17 | agent-memory, backdoor, knowledge-base-poisoning, red-teaming, retrieval-trigger | raw
AI-Driven Guided Response for Security Operation Centers with Microsoft Copilot for Security | paper | 2024-07-12 | Amir Gharib, Jovan Kalajdjieski, Robert McCann, Scott Freitas, ai-for-security, ai-soc | raw
AI Agents Under Threat: A Survey of Key Security Challenges and Future Pathways | paper | 2024-06-04 | Changzhou Han, Junwu Xiong, Sheng Wen, Wanlun Ma, Yang Xiang, Yongjian Guo | raw
Security of AI Agents | paper | 2024-06 | agent-security, architecture, defenses, security-for-ai | raw
Generative AI in Cybersecurity: A Comprehensive Review of LLM Applications and Vulnerabilities | paper | 2024-05-21 | ai-for-security, datasets, genai, incident-response, survey, threat-detection | raw
Large Language Models for Cyber Security: A Systematic Literature Review | paper | 2024-05-08 | ai-for-security, llm4security, malware-analysis, survey, threat-intelligence, vulnerability-detection | raw
ExpeL: LLM Agents Are Experiential Learners | paper | 2024-03-24 | experience-pool, experiential-learning, faiss, retrieval, successful-trajectories | raw
InjecAgent: Benchmarking Indirect Prompt Injections in Tool-Integrated Large Language Model Agents | paper | 2024-03-05 | agents, ai-security, benchmarks, prompt-injection, tool-use | raw
PoisonedRAG: Knowledge Corruption Attacks to Retrieval-Augmented Generation of Large Language Models | paper | 2024-02-12 | Binghui Wang, Jinyuan Jia, Runpeng Geng, Wei Zou, data-poisoning, knowledge-poisoning | raw
Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training | paper | 2024-01-10 | Anthropic coauthors, Carson Denison, Evan Hubinger, Jesse Mu, Mike Lambert, backdoors | raw
CyberSecEval 3: Advancing the Evaluation of Cybersecurity Risks and Capabilities in Large Language M | paper | 2024 | ai-for-security, benchmark, cyberseceval, llm-evaluation, offensive-security, security-for-ai | raw
Benchmarking Prompt-Injection Attacks on Tool-Integrated LLM Agents | paper | 2024 | ai-security, data-exfiltration, privacy, prompt-injection, tool-integrated-agents | raw
Voyager: An Open-Ended Embodied Agent with Large Language Models | paper | 2023-05-25 | automatic-curriculum, embodied-agent, executable-code, lifelong-learning, skill-library | raw
Reflexion: Language Agents with Verbal Reinforcement Learning | paper | 2023-03-20 | episodic-memory-buffer, reflection, trajectory, verbal-reinforcement-learning | raw
Not what you've signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Pro | paper | 2023-02-23 | Christoph Endres, Kai Greshake, Mario Fritz, Sahar Abdelnabi, Shailesh Mishra, Thorsten Holz | raw
That Escalated Quickly: An ML Framework for Alert Prioritization | preprint | 2023-02-13 | ai-for-security, ai-soc, alert-prioritization, machine-learning, managed-security | raw
Context2Vector: Accelerating security event triage via context representation learning | journal_paper | 2022-06 | ai-for-security, ai-soc, alert-triage, human-in-the-loop, representation-learning | raw
Improved Detection and Response via Optimized Alerts: Usability Study | journal_paper | 2022-05-31 | ai-for-security, ai-soc, alert-fatigue, machine-learning, usability | raw
DEEPCASE: Semi-Supervised Contextual Analysis of Security Events | conference_paper | 2022-05 | ai-for-security, ai-soc, deep-learning, event-correlation, semi-supervised-learning | raw
An Assessment of the Usability of Machine Learning Based Tools for the Security Operations Center | preprint | 2020-12-16 | ai-for-security, ai-soc, human-ai-collaboration, machine-learning, usability | raw
A user-centric machine learning framework for cyber security operations center | conference_paper | 2017-07 | ai-for-security, ai-soc, alert-triage, machine-learning, user-centric | raw
Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents | conference_paper | | Shao, Shuai, agent-security, memory, misevolution, self-evolving-agents, tools | raw
SoK: The Attack Surface of Agentic AI -- Tools, and Autonomy | preprint | | Dehghantanha, Ali, Homayoun, Sajad, agentic-ai, attack-surface, autonomy, multi-agent-security | raw
MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents | paper | | | raw
MAS Misevolution Propagation Collection 2026-06-26 | collection_index | 2026-06-26 | collection, error-cascade, mas-misevolution-propagation, memory-poisoning, multi-agent, self-evolving-agents | raw
Explainable AI in Cybersecurity Operations: Lessons Learned from User Studies | paper | 2026-06-16 | analyst-decision-support, cybersecurity-operations, explainable-ai, glossary-gap, soc, xai | raw
Bounded Autonomy in the SOC: Mitigating Hallucinations in Agentic Incident Response via Neurosymboli | unknown | | | raw
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents | paper | | | raw
AgenticCyOps: Securing Multi-Agentic AI Integration in Enterprise Cyber Operations | unknown | | | raw
AgentDojo: A Dynamic Environment to Evaluate Prompt Injection Attacks and Defenses for LLM Agents | paper | | | raw
Agent Security Bench (ASB): Formalizing and Benchmarking Attacks and Defenses in LLM-based Agents | paper | | | raw
AI Security Paper Collection 2026-06-29 | collection | 2026-06-29 | ai-security, collection, papers, weekly-ingest | raw
AI SOC Q1 Journal and Peer-Reviewed Conference Collection | collection_manifest | 2026-06-30 | ai-for-security, ai-soc, collection-manifest, peer-reviewed, q1-journal | raw
A Survey of Self-Evolving Agents: What, When, How, and Where to Evolve on the Path to Artificial Sup | survey | | Gao, agent-memory, agent-tools, self-evolving-agents, survey, taxonomy | raw