AI Security Research Portal
method | active | Methods

Benchmark-Based Security Evaluation

Purpose

Use benchmark papers to compare attacks, defenses, models, tasks, and deployment assumptions. For each benchmark, track its scope, its realism, its contamination risk, and how well its results transfer to production incidents.
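One way to make this tracking concrete is to keep a small structured record per benchmark. The sketch below is illustrative only: the `BenchmarkRecord` fields, the `Level` scale, and the `comparison_caveats` helper are hypothetical names chosen to mirror the dimensions listed above, not a schema this portal defines, and the two example records are placeholders rather than real benchmarks.

```python
from dataclasses import dataclass, field
from enum import Enum


class Level(Enum):
    """Coarse three-step scale (an assumption; any ordinal scale would do)."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class BenchmarkRecord:
    """One benchmark paper, annotated along the dimensions to compare and track."""
    title: str
    attacks: set[str] = field(default_factory=set)
    defenses: set[str] = field(default_factory=set)
    models: set[str] = field(default_factory=set)
    tasks: set[str] = field(default_factory=set)
    deployment_assumptions: set[str] = field(default_factory=set)
    realism: Level = Level.MEDIUM
    contamination_risk: Level = Level.MEDIUM
    production_transfer_notes: str = ""


def comparison_caveats(a: BenchmarkRecord, b: BenchmarkRecord) -> list[str]:
    """List reasons why results from two benchmarks may not be directly comparable."""
    caveats = []
    if not (a.tasks & b.tasks):
        caveats.append("no shared tasks")
    if not (a.models & b.models):
        caveats.append("no shared models")
    if a.deployment_assumptions != b.deployment_assumptions:
        caveats.append("deployment assumptions differ")
    for record in (a, b):
        if record.contamination_risk is Level.HIGH:
            caveats.append(f"high contamination risk: {record.title}")
    return caveats


# Placeholder records; every value here is hypothetical.
bench_a = BenchmarkRecord(
    title="Benchmark A (hypothetical)",
    tasks={"web-browsing"},
    models={"model-x"},
    deployment_assumptions={"sandboxed tools"},
    contamination_risk=Level.HIGH,
)
bench_b = BenchmarkRecord(
    title="Benchmark B (hypothetical)",
    tasks={"web-browsing", "code-review"},
    models={"model-y"},
    deployment_assumptions={"sandboxed tools"},
)

caveats = comparison_caveats(bench_a, bench_b)
```

Here the two placeholder benchmarks share a task and the same deployment assumptions, so the only caveats raised are the lack of a shared model and the high contamination risk on the first record. Recording caveats explicitly, rather than a single comparability score, keeps the reasons visible when a claim is later evaluated.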

Evidence Base

Representative Sources

Title | Kind | Date | Tags
Latest AI Security Collection 2026-06-25 | collection_manifest | 2026-06-25 | agentic-ai, ai-for-security, ai-security, collection-manifest, latest, mcp
AI Security Paper Collection 2026-06-25 | collection_manifest | 2026-06-25 | agent-security, ai-for-security, ai-security, collection-manifest, jailbreak, papers
AWS Security Agent adds threat modeling, Kiro power and Claude Code plugin, and more | official_blog | 2026-06-17 | ai-for-security, code-review, coding-agents, mcp, security-agent, stride
Securing the Agentic AI Frontier: Palo Alto Networks and Databricks Deliver a New Standard for AI Se | official_blog | 2026-06-16 | agentic-ai, ai-gateway, data-security, governance, mcp, runtime-security
AgentCanary: A Security Evaluation Framework for Autonomous AI Agents in Real Executable Environment | paper | 2026-06-09 | agent-security, ai-security, benchmark, evaluation, executable-environment
State of Agentic AI Security and Governance 2.01 | official_whitepaper | 2026-06-01 | OWASP Gen AI Security Project, agentic-ai, governance, owasp, security-for-ai, standards
Model Context Protocol (MCP): Security Design Considerations for AI-Driven Automation | government_guidance | 2026-05-20 | NSA Artificial Intelligence Security Center, agent-security, automation, government-guidance, mcp-security, security-for-ai
The State of AI Cybersecurity 2026 | vendor_report | 2026 | ai-for-security, ai-security, ciso-survey, industry-report, security-operations, soc
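Because each source carries a kind and a set of tags, the listing above can be filtered to separate benchmark evidence from vendor or blog material. The following is a minimal sketch under stated assumptions: the `Source` type and the `with_tag` helper are hypothetical, and the three entries are copied from the listing (titles as they appear there, including truncation) purely for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """One row of the source listing (hypothetical structure)."""
    title: str
    kind: str
    date: str  # kept as text because some entries give only a year
    tags: frozenset


def with_tag(sources, tag):
    """Return the sources that carry the given tag, preserving input order."""
    return [s for s in sources if tag in s.tags]


# Three entries taken from the listing above.
SOURCES = [
    Source(
        "AgentCanary: A Security Evaluation Framework for Autonomous AI Agents "
        "in Real Executable Environment",
        "paper",
        "2026-06-09",
        frozenset({"agent-security", "ai-security", "benchmark",
                   "evaluation", "executable-environment"}),
    ),
    Source(
        "State of Agentic AI Security and Governance 2.01",
        "official_whitepaper",
        "2026-06-01",
        frozenset({"OWASP Gen AI Security Project", "agentic-ai", "governance",
                   "owasp", "security-for-ai", "standards"}),
    ),
    Source(
        "The State of AI Cybersecurity 2026",
        "vendor_report",
        "2026",
        frozenset({"ai-for-security", "ai-security", "ciso-survey",
                   "industry-report", "security-operations", "soc"}),
    ),
]

benchmark_sources = with_tag(SOURCES, "benchmark")
```

Of these three entries, only the AgentCanary paper carries the `benchmark` tag, so it is the only one returned. The other two remain useful as context, but under this method they would not count as benchmark evidence.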

Use In This Portal

Apply this method when ingesting sources, evaluating claims, or answering research questions that need evidence discipline rather than narrative summary.