ML Anthology
Soligo, Anna
2 publications
ICLR 2026
Emergent Misalignment Is Easy, Narrow Misalignment Is Hard
Anna Soligo, Edward Turner, Senthooran Rajamanoharan, Neel Nanda
ICML 2025
Inducing, Detecting and Characterising Neural Modules: A Pipeline for Functional Interpretability in Reinforcement Learning
Anna Soligo, Pietro Ferraro, David Boyle