Open Weight AI SOC Paper Collection
Untrusted source collection. Use individual source records for claims.
Scope
This collection tracks papers and thesis/report-style sources where open-weight or open-source models are used, evaluated, audited, or proposed for SOC/security-operations workflows.
Newly Collected Sources
| Source | Kind | Open-weight relevance | Raw file |
|---|---|---|---|
| When the Ruler is Broken: Parsing-Induced Suppression in LLM-Based Security Log Evaluation | preprint | Audits OpenSOC-AI/TinyLlama evaluation and proposes SOC-Bench v0. | when-ruler-broken-parsing-induced-suppression-soc-log-eval-2026 |
| Llama-3.1-FoundationAI-SecurityLLM-Base-8B Technical Report | preprint | Open-weight cybersecurity foundation model built on Llama 3.1. | foundation-sec-8b-base-technical-report-2025 |
| Llama-3.1-FoundationAI-SecurityLLM-8B-Instruct Technical Report | preprint | Instruction-tuned open-weight cybersecurity assistant model for professional workflows. | foundation-sec-8b-instruct-technical-report-2025 |
| Evaluation of LLM Agents for the SOC Tier 1 Analyst Triage Process | thesis | Uses Llama 3 70B within a SOC Tier 1 triage workflow. | soc-tier1-llm-agents-triage-thesis-2024 |
Existing Related Sources Already In Vault
| Source | Why relevant |
|---|---|
| opensoc-ai-parameter-efficient-log-analysis-2026 | TinyLlama-1.1B + LoRA for threat classification, MITRE ATT&CK mapping, and severity assessment from raw logs. |
| soc-ai-companion-wosoc-2026 | Locally deployed SOC assistant paper; current source should be rechecked for gpt-oss/open-weight deployment details. |
| journal-big-data-llm-soc-risk-management-2026 | Compares GPT-3.5 Turbo and Mistral-7B on SOC-oriented network telemetry tasks. |
| siabench-security-incident-analysis-2026 | Evaluates open-weight and closed-weight LLMs for security incident analysis with SIA Agent/SIABENCH. |
| e-mantra-cognitive-threat-detection-soc-2026 | WOSOC 2026 SOC-oriented LLM routing paper; mentions open-source alternatives such as Llama-3 70B for API dependency mitigation. |
Recommended Ingest Questions
- Do open-weight models materially ease the privacy and cost constraints of SOC deployment without sacrificing incident-analysis reliability?
- Which SOC tasks are plausible for small/local models: log classification, ATT&CK mapping, query generation, alert triage, CTI enrichment, or response recommendation?
- How should evaluations avoid parser-induced score distortion, dataset leakage, and overly synthetic alert workflows?
- When should a SOC prefer domain-specialized open-weight models over general frontier APIs?
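
To make the question about parser-induced score distortion concrete, here is a minimal sketch that scores the same model outputs with two different label parsers. Everything in it is hypothetical: the label set, the outputs, the gold labels, and the function names are invented for illustration and are not taken from any source in this collection.

```python
import re

# Hypothetical label set for a toy log-classification task.
LABELS = ("benign", "malicious")


def strict_parse(output):
    """Accept the output only if it is exactly one of the known labels."""
    text = output.strip().lower()
    return text if text in LABELS else None


def lenient_parse(output):
    """Accept the first known label found anywhere in the output.

    Note: this has its own failure modes, e.g. "not malicious" would be
    read as "malicious". It is shown only as a contrast to strict_parse.
    """
    match = re.search(r"\b(benign|malicious)\b", output.lower())
    return match.group(1) if match else None


def accuracy(outputs, gold, parser):
    """Fraction of outputs whose parsed label equals the gold label.

    Outputs the parser cannot read (None) are counted as wrong.
    """
    correct = sum(parser(o) == g for o, g in zip(outputs, gold))
    return correct / len(gold)


# Invented model outputs and gold labels.
outputs = ["malicious", "Label: malicious", "The log looks benign.", "benign"]
gold = ["malicious", "malicious", "benign", "malicious"]

strict_score = accuracy(outputs, gold, strict_parse)
lenient_score = accuracy(outputs, gold, lenient_parse)
print(f"strict={strict_score:.2f} lenient={lenient_score:.2f}")
# prints "strict=0.25 lenient=0.75"
```

In this toy case the model outputs are identical across both runs, and only the parser changes the reported accuracy. An ingest pass could therefore record, for each source, how model outputs were parsed and how unparseable outputs were scored.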