ML Anthology
Pan, Alexa Yue
1 publication
ICLR 2024
How to Catch an AI Liar: Lie Detection in Black-Box LLMs by Asking Unrelated Questions
Lorenzo Pacchiardi, Alex James Chan, Sören Mindermann, Ilan Moscovitz, Alexa Yue Pan, Yarin Gal, Owain Evans, Jan M. Brauner