Zhong, Ruiqi

12 publications

ICLR 2025 Language Models Learn to Mislead Humans via RLHF Jiaxin Wen, Ruiqi Zhong, Akbir Khan, Ethan Perez, Jacob Steinhardt, Minlie Huang, Samuel R. Bowman, He He, Shi Feng
ICMLW 2024 AdaptiveBackdoor: Backdoored Language Model Agents That Detect Human Overseers Heng Wang, Ruiqi Zhong, Jiaxin Wen, Jacob Steinhardt
CVPR 2024 Describing Differences in Image Sets with Natural Language Lisa Dunlap, Yuhui Zhang, Xiaohan Wang, Ruiqi Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy
ICML 2024 Do Models Explain Themselves? Counterfactual Simulatability of Natural Language Explanations Yanda Chen, Ruiqi Zhong, Narutatsu Ri, Chen Zhao, He He, Jacob Steinhardt, Zhou Yu, Kathleen McKeown
NeurIPS 2024 Explaining Datasets in Words: Statistical Models with Natural Language Parameters Ruiqi Zhong, Heng Wang, Dan Klein, Jacob Steinhardt
TMLR 2024 Foundational Challenges in Assuring Alignment and Safety of Large Language Models Usman Anwar, Abulhair Saparov, Javier Rando, Daniel Paleka, Miles Turpin, Peter Hase, Ekdeep Singh Lubana, Erik Jenner, Stephen Casper, Oliver Sourbut, Benjamin L. Edelman, Zhaowei Zhang, Mario Günther, Anton Korinek, Jose Hernandez-Orallo, Lewis Hammond, Eric J Bigelow, Alexander Pan, Lauro Langosco, Tomasz Korbak, Heidi Chenyu Zhang, Ruiqi Zhong, Seán Ó hÉigeartaigh, Gabriel Recchia, Giulio Corsi, Alan Chan, Markus Anderljung, Lilian Edwards, Aleksandar Petrov, Christian Schroeder de Witt, Sumeet Ramesh Motwani, Yoshua Bengio, Danqi Chen, Philip Torr, Samuel Albanie, Tegan Maharaj, Jakob Nicolaus Foerster, Florian Tramèr, He He, Atoosa Kasirzadeh, Yejin Choi, David Krueger
ICML 2023 DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation Yuhang Lai, Chengxi Li, Yiming Wang, Tianyi Zhang, Ruiqi Zhong, Luke Zettlemoyer, Wen-Tau Yih, Daniel Fried, Sida Wang, Tao Yu
NeurIPS 2023 Goal Driven Discovery of Distributional Differences via Language Descriptions Ruiqi Zhong, Peter Zhang, Steve Li, Jinwoo Ahn, Dan Klein, Jacob Steinhardt
ICLR 2023 InCoder: A Generative Model for Code Infilling and Synthesis Daniel Fried, Armen Aghajanyan, Jessy Lin, Sida Wang, Eric Wallace, Freda Shi, Ruiqi Zhong, Scott Yih, Luke Zettlemoyer, Mike Lewis
ICML 2022 Describing Differences Between Text Distributions with Natural Language Ruiqi Zhong, Charlie Snell, Dan Klein, Jacob Steinhardt
NeurIPSW 2021 The Effect of Model Size on Worst-Group Generalization Alan Le Pham, Eunice Chan, Vikranth Srivatsa, Dhruba Ghosh, Yaoqing Yang, Yaodong Yu, Ruiqi Zhong, Joseph E. Gonzalez, Jacob Steinhardt
ICML 2018 Subspace Embedding and Linear Regression with Orlicz Norm Alexandr Andoni, Chengyu Lin, Ying Sheng, Peilin Zhong, Ruiqi Zhong