Taylor, Mia

2 publications

ICLR 2026. Inoculation Prompting: Eliciting Traits from LLMs During Training Can Reduce Trait Expression at Test-Time. Daniel Tan, Anders Cairns Woodruff, Niels Warncke, Arun Jose, Maxime Nicolas Riché, David Demitri Africa, Mia Taylor
ICLR 2026. Strategic Obfuscation of Deceptive Reasoning in Language Models. Arun Jose, Niels Warncke, Mia Taylor