Shiromani, Shikhar

1 publication

ACML 2025. ChameleonBench: Quantifying Alignment Faking in Large Language Models. Archie Chaudhury, Shikhar Shiromani