AI Security Research Portal
Sources | source | seed | 2026-07-04 | ai-security, ai-for-security, benchmark, llm-agents, security-tasks, vulnerability-reproduction

Capture Summary

Paper proposing automated benchmarking of LLM agents on real-world security tasks, including scalable task construction and vulnerability reproduction.

Relevance

Collection Notes