Athiwaratkun, Ben

13 publications

ICML 2025 Improving Model Alignment Through Collective Intelligence of Open-Source Models Junlin Wang, Roy Xie, Shang Zhu, Jue Wang, Ben Athiwaratkun, Bhuwan Dhingra, Shuaiwen Leon Song, Ce Zhang, James Zou
ICML 2025 Ladder-Residual: Parallelism-Aware Architecture for Accelerating Large Model Inference with Communication Overlapping Muru Zhang, Mayank Mishra, Zhongzhu Zhou, William Brandon, Jue Wang, Yoon Kim, Jonathan Ragan-Kelley, Shuaiwen Leon Song, Ben Athiwaratkun, Tri Dao
ICLR 2025 Mixture-of-Agents Enhances Large Language Model Capabilities Junlin Wang, Jue Wang, Ben Athiwaratkun, Ce Zhang, James Zou
ICLR 2025 Scaling Instruction-Tuned LLMs to Million-Token Contexts via Hierarchical Synthetic Data Generation Linda He, Jue Wang, Maurice Weber, Shang Zhu, Ben Athiwaratkun, Ce Zhang
ICLR 2025 Training-Free Activation Sparsity in Large Language Models James Liu, Pragaash Ponnusamy, Tianle Cai, Han Guo, Yoon Kim, Ben Athiwaratkun
NeurIPS 2025 Weaver: Shrinking the Generation-Verification Gap by Scaling Compute for Verification Jon Saad-Falcon, E. Kelly Buchanan, Mayee F Chen, Tzu-Heng Huang, Brendan McLaughlin, Tanvir Bhathal, Shang Zhu, Ben Athiwaratkun, Frederic Sala, Scott Linderman, Azalia Mirhoseini, Christopher Re
ICML 2024 Bifurcated Attention for Single-Context Large-Batch Sampling Ben Athiwaratkun, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Haifeng Qian, Hantian Ding, Qing Sun, Jun Wang, Jiacheng Guo, Liangfu Chen, Parminder Bhatia, Ramesh Nallapati, Sudipta Sengupta, Bing Xiang
NeurIPS 2024 RedPajama: An Open Dataset for Training Large Language Models Maurice Weber, Daniel Y. Fu, Quentin Anthony, Yonatan Oren, Shane Adams, Anton Alexandrov, Xiaozhong Lyu, Huu Nguyen, Xiaozhe Yao, Virginia Adams, Ben Athiwaratkun, Rahul Chalamala, Kezhen Chen, Max Ryabinin, Tri Dao, Percy Liang, Christopher RĂ©, Irina Rish, Ce Zhang
ICLR 2023 Multi-Lingual Evaluation of Code Generation Models Ben Athiwaratkun, Sanjay Krishna Gouda, Zijian Wang, Xiaopeng Li, Yuchen Tian, Ming Tan, Wasi Uddin Ahmad, Shiqi Wang, Qing Sun, Mingyue Shang, Sujan Kumar Gonugondla, Hantian Ding, Varun Kumar, Nathan Fulton, Arash Farahani, Siddhartha Jain, Robert Giaquinto, Haifeng Qian, Murali Krishna Ramanathan, Ramesh Nallapati, Baishakhi Ray, Parminder Bhatia, Sudipta Sengupta, Dan Roth, Bing Xiang
ICMLW 2023 On IO-Efficient Attention Mechanisms: Context-Aware Bifurcated Attention and the Generalized Multi-Group Attention Ben Athiwaratkun, Sujan Kumar Gonugondla, Sanjay Krishna Gouda, Haifeng Qian, Hantian Ding, Qing Sun, Jun Wang, Liangfu Chen, Jiacheng Guo, Parminder Bhatia, Ramesh Nallapati, Sudipta Sengupta, Bing Xiang
ICLR 2021 Structured Prediction as Translation Between Augmented Natural Languages Giovanni Paolini, Ben Athiwaratkun, Jason Krone, Jie Ma, Alessandro Achille, Rishita Anubhai, Cicero Nogueira dos Santos, Bing Xiang, Stefano Soatto
ICLR 2019 There Are Many Consistent Explanations of Unlabeled Data: Why You Should Average Ben Athiwaratkun, Marc Finzi, Pavel Izmailov, Andrew Gordon Wilson
ICLR 2018 Hierarchical Density Order Embeddings Ben Athiwaratkun, Andrew Gordon Wilson