Qin, Andrew

1 publication

ICLR 2026. Steering Evaluation-Aware Language Models to Act like They Are Deployed. Tim Tian Hua, Andrew Qin, Samuel Marks, Neel Nanda.