SOC Evaluation Parser Audit
Purpose
A SOC evaluation parser audit checks whether the scoring pipeline for LLM-based security operations center (SOC) tasks faithfully extracts model outputs, maps them to the intended taxonomy, and reports uncertainty without suppressing valid answers.
When To Use
- Evaluating open-weight log-analysis or alert-triage models.
- Comparing closed and open models on structured SOC tasks.
- Reproducing claims from SOC benchmarks where model outputs are parsed into labels, severities, ATT&CK techniques, or response categories.
Procedure
- Preserve raw model outputs before parsing.
- Define the expected output schema and allowed aliases.
- Run strict and fuzzy parsing side by side on the same raw outputs and compare the resulting scores.
- Manually audit a stratified sample of parse failures.
- Report the invalid-output rate separately from the wrong-answer rate, so that parse failures are not counted as model errors.
- Publish parser code, taxonomy, prompt template, and scoring script.
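
The procedure above can be sketched in a few lines of Python. This is a minimal illustration, not the parser of any benchmark cited here: the severity labels, the alias table, the sample records, and the function names (parse_strict, parse_fuzzy, score) are all hypothetical. It keeps the raw outputs untouched, runs a strict and a fuzzy parser over the same records, and counts invalid outputs separately from wrong answers.

```python
import re
from collections import Counter

# Hypothetical schema: canonical severity labels plus allowed aliases.
ALIASES = {
    "low": "low",
    "medium": "medium",
    "med": "medium",
    "moderate": "medium",
    "high": "high",
    "critical": "critical",
    "crit": "critical",
}
CANONICAL = set(ALIASES.values())

# Hypothetical (raw_output, gold_label) pairs; raw text is stored unmodified.
RECORDS = [
    ("high", "high"),
    ("Severity: High", "high"),
    ("med", "medium"),
    ("low", "high"),
    ("Either low or high", "low"),
    ("I cannot determine this.", "low"),
]


def parse_strict(raw):
    """Accept only an exact canonical label, ignoring case and outer whitespace."""
    text = raw.strip().lower()
    return text if text in CANONICAL else None


def parse_fuzzy(raw):
    """Accept the output if it names exactly one distinct label after alias mapping."""
    tokens = re.findall(r"[a-z]+", raw.lower())
    found = {ALIASES[t] for t in tokens if t in ALIASES}
    return found.pop() if len(found) == 1 else None


def score(records, parser):
    """Count correct, wrong, and invalid (unparseable) outputs separately."""
    counts = Counter()
    for raw, gold in records:
        pred = parser(raw)
        if pred is None:
            counts["invalid"] += 1
        elif pred == gold:
            counts["correct"] += 1
        else:
            counts["wrong"] += 1
    n = len(records)
    return {
        "n": n,
        "correct": counts["correct"],
        "wrong": counts["wrong"],
        "invalid": counts["invalid"],
        "invalid_rate": counts["invalid"] / n if n else 0.0,
        "wrong_rate": counts["wrong"] / n if n else 0.0,
    }


if __name__ == "__main__":
    for name, parser in [("strict", parse_strict), ("fuzzy", parse_fuzzy)]:
        print(name, score(RECORDS, parser))
```

On these six sample records the strict parser marks four outputs invalid and the fuzzy parser marks two, while both report one wrong answer. A gap of that kind between the two invalid counts is what the manual audit of parse failures is meant to explain. Ambiguous outputs that name more than one label are deliberately left invalid in this sketch rather than resolved by guessing.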
Evidence Base
- SRC-20260703-open-weight-ai-soc
- when-ruler-broken-parsing-induced-suppression-soc-log-eval-2026
- opensoc-ai-parameter-efficient-log-analysis-2026