Han, Ziwen

9 publications

ICLR 2025 Breach by a Thousand Leaks: Unsafe Information Leakage in 'Safe' AI Responses David Glukhov, Ziwen Han, Ilia Shumailov, Vardan Papyan, Nicolas Papernot
ICLR 2025 Planning in Natural Language Improves LLM Search for Code Generation Evan Z Wang, Federico Cassano, Catherine Wu, Yunfeng Bai, William Song, Vaskar Nath, Ziwen Han, Sean M. Hendryx, Summer Yue, Hugh Zhang
ICLR 2025 Teaching LLMs How to Learn with Contextual Fine-Tuning Younwoo Choi, Muhammad Adil Asif, Ziwen Han, John Willes, Rahul Krishnan
NeurIPSW 2024 LLM Defenses Are Not Robust to Multi-Turn Human Jailbreaks Yet Nathaniel Li, Ziwen Han, Ian Steneker, Willow E. Primack, Riley Goodside, Hugh Zhang, Zifan Wang, Cristina Menghini, Summer Yue
NeurIPSW 2024 Planning in Natural Language Improves LLM Search for Code Generation Evan Z Wang, Federico Cassano, Catherine Wu, Yunfeng Bai, William Song, Vaskar Nath, Ziwen Han, Sean M. Hendryx, Summer Yue, Hugh Zhang
NeurIPSW 2024 Teaching LLMs How to Learn with Contextual Fine-Tuning Younwoo Choi, Muhammad Adil Asif, Ziwen Han, John Willes, Rahul Krishnan
ICLR 2023 Large Language Models Are Human-Level Prompt Engineers Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
NeurIPSW 2022 Large Language Models Are Human-Level Prompt Engineers Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba
NeurIPSW 2022 Steering Large Language Models Using APE Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba