Sources

Capture Notes

Paper on interpretable safety steering for multi-turn coding agents through mechanistic subspaces.

AI security relevance:

Relevant to coding-agent runtime safety and multi-turn behavior control.
Connects to existing work on coding-agent exploitation, prompt injection, and tool-execution risk.