Capture Notes
Paper on detecting jailbreaks through entropy dynamics in intermediate model layers.
AI security relevance:
- Useful for model-internal detection and guardrail design beyond surface prompt filtering.
- Relevant to work on jailbreak detection, safety steering, and evaluation robustness.
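
Illustrative sketch (not the paper's method, which these notes do not describe): one generic way to look at "entropy dynamics" is to compute the Shannon entropy of a next-token distribution at each layer and then track how it changes from layer to layer. The code assumes per-layer logits are already available, for example from a logit-lens style projection of hidden states; that projection step, the toy values, and all function names here are hypothetical.

```python
import math

def softmax_entropy(logits):
    """Shannon entropy (in nats) of softmax(logits), computed stably."""
    m = max(logits)                       # subtract max to avoid overflow
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    return -sum(p * math.log(p) for p in probs if p > 0.0)

def entropy_trajectory(per_layer_logits):
    """One entropy value per layer, in layer order."""
    return [softmax_entropy(layer) for layer in per_layer_logits]

def entropy_deltas(trajectory):
    """Layer-to-layer change in entropy (the 'dynamics')."""
    return [b - a for a, b in zip(trajectory, trajectory[1:])]

# Toy input (assumption): three "layers" over a four-token vocabulary,
# growing more confident with depth.
toy_layers = [
    [0.0, 0.0, 0.0, 0.0],
    [2.0, 0.0, 0.0, 0.0],
    [10.0, 0.0, 0.0, 0.0],
]
trajectory = entropy_trajectory(toy_layers)
deltas = entropy_deltas(trajectory)
```

A detector built on this idea would compare such trajectories for benign and adversarial prompts; the features and thresholds to use would have to come from the paper itself.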