Zhang, Hanlin

24 publications

NeurIPS 2025 · AlgoTune: Can Language Models Speed up General-Purpose Numerical Programs? · Ori Press, Brandon Amos, Haoyu Zhao, Yikai Wu, Samuel Ainsworth, Dominik Krupke, Patrick Kidger, Touqir Sajed, Bartolomeo Stellato, Jisun Park, Nathanael Bosch, Eli Meril, Albert Steppi, Arman Zharmagambetov, Fangzhao Zhang, David Pérez-Piñeiro, Alberto Mercurio, Ni Zhan, Talor Abramovich, Kilian Lieret, Hanlin Zhang, Shirley Huang, Matthias Bethge, Ofir Press
ICLR 2025 · Eliminating Position Bias of Language Models: A Mechanistic Approach · Ziqi Wang, Hanlin Zhang, Xiner Li, Kuan-Hao Huang, Chi Han, Shuiwang Ji, Sham M. Kakade, Hao Peng, Heng Ji
NeurIPS 2025 · EvoLM: In Search of Lost Training Dynamics for Language Model Reasoning · Zhenting Qi, Fan Nie, Alexandre Alahi, James Zou, Himabindu Lakkaraju, Yilun Du, Eric P. Xing, Sham M. Kakade, Hanlin Zhang
ICLR 2025 · Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems · Zhenting Qi, Hanlin Zhang, Eric P. Xing, Sham M. Kakade, Himabindu Lakkaraju
ICLR 2025 · How Does Critical Batch Size Scale in Pre-Training? · Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Foster, Sham M. Kakade
ICLR 2025 · Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models · Yuda Song, Hanlin Zhang, Carson Eisenach, Sham M. Kakade, Dean Foster, Udaya Ghai
NeurIPS 2024 · CoLoR-Filter: Conditional Loss Reduction Filtering for Targeted Language Model Pre-Training · David Brandfonbrener, Hanlin Zhang, Andreas Kirsch, Jonathan Richard Schwarz, Sham Kakade
NeurIPSW 2024 · Connections Between Schedule-Free SGD, Accelerated SGD Variants, and Weight Averaging · Depen Morwani, Nikhil Vyas, Hanlin Zhang, Sham M. Kakade
NeurIPS 2024 · DataComp-LM: In Search of the Next Generation of Training Sets for Language Models · Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, Maciej Kilian, Hanlin Zhang, Rulin Shao, Sarah Pratt, Sunny Sanyal, Gabriel Ilharco, Giannis Daras, Kalyani Marathe, Aaron Gokaslan, Jieyu Zhang, Khyathi Chandu, Thao Nguyen, Igor Vasiljevic, Sham Kakade, Shuran Song, Sujay Sanghavi, Fartash Faghri, Sewoong Oh, Luke Zettlemoyer, Kyle Lo, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, Stephanie Wang, Dirk Groeneveld, Luca Soldaini, Pang Wei Koh, Jenia Jitsev, Thomas Kollar, Alexandros G. Dimakis, Yair Carmon, Achal Dave, Ludwig Schmidt, Vaishaal Shankar
NeurIPSW 2024 · Eliminating Position Bias of Language Models: A Mechanistic Approach · Ziqi Wang, Hanlin Zhang, Xiner Li, Kuan-Hao Huang, Chi Han, Shuiwang Ji, Sham M. Kakade, Hao Peng, Heng Ji
ICLRW 2024 · Follow My Instruction and Spill the Beans: Scalable Data Extraction from Retrieval-Augmented Generation Systems · Zhenting Qi, Hanlin Zhang, Eric P. Xing, Sham M. Kakade, Himabindu Lakkaraju
NeurIPSW 2024 · How Does Critical Batch Size Scale in Pre-Training? · Hanlin Zhang, Depen Morwani, Nikhil Vyas, Jingfeng Wu, Difan Zou, Udaya Ghai, Dean Foster, Sham M. Kakade
NeurIPSW 2024 · Mind the Gap: Examining the Self-Improvement Capabilities of Large Language Models · Yuda Song, Hanlin Zhang, Carson Eisenach, Sham M. Kakade, Dean Foster, Udaya Ghai
ICLRW 2024 · Watermarks in the Sand: Impossibility of Strong Watermarking for Generative Models · Hanlin Zhang, Benjamin L. Edelman, Danilo Francati, Daniele Venturi, Giuseppe Ateniese, Boaz Barak
ICML 2024 · Watermarks in the Sand: Impossibility of Strong Watermarking for Language Models · Hanlin Zhang, Benjamin L. Edelman, Danilo Francati, Daniele Venturi, Giuseppe Ateniese, Boaz Barak
NeurIPSW 2023 · A Study on the Calibration of In-Context Learning · Hanlin Zhang, YiFan Zhang, Yaodong Yu, Dhruv Madeka, Dean Foster, Eric P. Xing, Himabindu Lakkaraju, Sham M. Kakade
ICML 2023 · Do the Rewards Justify the Means? Measuring Trade-Offs Between Rewards and Ethical Behavior in the Machiavelli Benchmark · Alexander Pan, Jun Shern Chan, Andy Zou, Nathaniel Li, Steven Basart, Thomas Woodside, Hanlin Zhang, Scott Emmons, Dan Hendrycks
TMLR 2023 · Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation · YiFan Zhang, Hanlin Zhang, Zachary Chase Lipton, Li Erran Li, Eric Xing
NeurIPSW 2022 · Exploring Transformer Backbones for Heterogeneous Treatment Effect Estimation · YiFan Zhang, Hanlin Zhang, Zachary Chase Lipton, Li Erran Li, Eric Xing
ICMLW 2022 · Improved Logical Reasoning of Language Models via Differentiable Symbolic Programming · Hanlin Zhang, Ziyang Li, Jiani Huang, Mayur Naik, Eric Xing
NeurIPSW 2022 · The Impact of Symbolic Representations on In-Context Learning for Few-Shot Reasoning · Hanlin Zhang, YiFan Zhang, Li Erran Li, Eric Xing
UAI 2022 · Toward Learning Human-Aligned Cross-Domain Robust Models by Countering Misaligned Features · Haohan Wang, Zeyi Huang, Hanlin Zhang, Yong Jae Lee, Eric P. Xing
CVPR 2022 · Towards Principled Disentanglement for Domain Generalization · Hanlin Zhang, Yi-Fan Zhang, Weiyang Liu, Adrian Weller, Bernhard Schölkopf, Eric P. Xing
NeurIPS 2020 · Towards Interpretable Natural Language Understanding with Explanations as Latent Variables · Wangchunshu Zhou, Jinyi Hu, Hanlin Zhang, Xiaodan Liang, Maosong Sun, Chenyan Xiong, Jian Tang