McAvoy, Alex

1 publication

ICLR 2026
PropensityBench: Evaluating Latent Safety Risks in Large Language Models via an Agentic Approach
Udari Madhushani Sehwag, Shayan Shabihi, Alex McAvoy, Vikash Sehwag, Yuancheng Xu, Dalton Towers, Furong Huang