Bukharin, Alexander

11 publications

ICML 2025. Deep Reinforcement Learning from Hierarchical Preference Design. Alexander Bukharin, Yixiao Li, Pengcheng He, Tuo Zhao
ICLR 2025. HelpSteer2-Preference: Complementing Ratings with Preferences. Zhilin Wang, Alexander Bukharin, Olivier Delalleau, Daniel Egert, Gerald Shen, Jiaqi Zeng, Oleksii Kuchaiev, Yi Dong
NeurIPS 2025. HelpSteer3-Preference: Open Human-Annotated Preference Data Across Diverse Tasks and Languages. Zhilin Wang, Jiaqi Zeng, Olivier Delalleau, Hoo-Chang Shin, Felipe Soares, Alexander Bukharin, Ellie Evans, Yi Dong, Oleksii Kuchaiev
NeurIPS 2024. Adaptive Preference Scaling for Reinforcement Learning with Human Feedback. Ilgee Hong, Zichong Li, Alexander Bukharin, Yixiao Li, Haoming Jiang, Tianbao Yang, Tuo Zhao
ICMLW 2024. RNR: Teaching Large Language Models to Follow Roles and Rules. Kuan Wang, Alexander Bukharin, Haoming Jiang, Qingyu Yin, Zhengyang Wang, Tuo Zhao, Jingbo Shang, Chao Zhang, Bing Yin, Xian Li, Jianshu Chen, Shiyang Li
NeurIPS 2024. Robust Reinforcement Learning from Corrupted Human Feedback. Alexander Bukharin, Ilgee Hong, Haoming Jiang, Zichong Li, Qingru Zhang, Zixuan Zhang, Tuo Zhao
ICLR 2023. Adaptive Budget Allocation for Parameter-Efficient Fine-Tuning. Qingru Zhang, Minshuo Chen, Alexander Bukharin, Pengcheng He, Yu Cheng, Weizhu Chen, Tuo Zhao
ICML 2023. Machine Learning Force Fields with Data Cost Aware Training. Alexander Bukharin, Tianyi Liu, Shengjie Wang, Simiao Zuo, Weihao Gao, Wen Yan, Tuo Zhao
NeurIPSW 2023. Machine Learning Force Fields with Data Cost Aware Training. Alexander Bukharin, Tianyi Liu, Shengjie Wang, Simiao Zuo, Weihao Gao, Wen Yan, Tuo Zhao
NeurIPS 2023. Robust Multi-Agent Reinforcement Learning via Adversarial Regularization: Theoretical Foundation and Stable Algorithms. Alexander Bukharin, Yan Li, Yue Yu, Qingru Zhang, Zhehui Chen, Simiao Zuo, Chao Zhang, Songan Zhang, Tuo Zhao
ICML 2022. PLATON: Pruning Large Transformer Models with Upper Confidence Bound of Weight Importance. Qingru Zhang, Simiao Zuo, Chen Liang, Alexander Bukharin, Pengcheng He, Weizhu Chen, Tuo Zhao