AI Security Research Portal
Sources | source | seed | 2026-07-04 | ai-security | ai-for-security | benchmark | ctf | cyber-range | llm-agents | capability-evaluation

Capture Summary

Benchmark framework for evaluating language-model agents on professional-level CTF tasks, including task environments, subtasks, and model/agent scaffold comparisons.

Relevance

Collection Notes