Tu, Yuheng

4 publications

ICLR 2025. AIR-BENCH 2024: A Safety Benchmark Based on Regulation and Policies Specified Risk Categories. Yi Zeng, Yu Yang, Andy Zhou, Jeffrey Ziwei Tan, Yuheng Tu, Yifan Mai, Kevin Klyman, Minzhou Pan, Ruoxi Jia, Dawn Song, Percy Liang, Bo Li.
NeurIPS 2025. Fantastic Bugs and Where to Find Them in AI Benchmarks. Sang T. Truong, Yuheng Tu, Michael Hardy, Anka Reuel, Zeyu Tang, Jirayu Burapacheep, Jonathan Jude Perera, Chibuike Uwakwe, Benjamin W. Domingue, Nick Haber, Sanmi Koyejo.
ICML 2025. Reliable and Efficient Amortized Model-Based Evaluation. Sang T. Truong, Yuheng Tu, Percy Liang, Bo Li, Sanmi Koyejo.
ICLRW 2025. Reliable and Efficient Amortized Model-Based Evaluation. Sang T. Truong, Yuheng Tu, Percy Liang, Bo Li, Sanmi Koyejo.