Siu, Vincent

1 publication

ICLR 2026. RepIt: Steering Language Models with Concept-Specific Refusal Vectors. Vincent Siu, Nathan W. Henry, Nicholas Crispino, Yang Liu, Dawn Song, Chenguang Wang.