Moscovitz, Ilan

1 publications

ICLR 2024 How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions Lorenzo Pacchiardi, Alex James Chan, Sören Mindermann, Ilan Moscovitz, Alexa Yue Pan, Yarin Gal, Owain Evans, Jan M. Brauner