Halawi, Danny

4 publications

ICLR 2025. ForecastBench: A Dynamic Benchmark of AI Forecasting Capabilities. Ezra Karger, Houtan Bastani, Chen Yueh-Han, Zachary Jacobs, Danny Halawi, Fred Zhang, Philip Tetlock
NeurIPS 2024. Approaching Human-Level Forecasting with Language Models. Danny Halawi, Fred Zhang, Chen Yueh-Han, Jacob Steinhardt
ICML 2024. Covert Malicious Finetuning: Challenges in Safeguarding LLM Adaptation. Danny Halawi, Alexander Wei, Eric Wallace, Tony Tong Wang, Nika Haghtalab, Jacob Steinhardt
ICLR 2024. Overthinking the Truth: Understanding How Language Models Process False Demonstrations. Danny Halawi, Jean-Stanislas Denain, Jacob Steinhardt