Shoeybi, Mohammad

18 publications

NeurIPS 2025. AceReason-Nemotron: Advancing Math and Code Reasoning Through Reinforcement Learning. Yang Chen, Zhuolin Yang, Zihan Liu, Chankyu Lee, Peng Xu, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping.
ICLR 2025. ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities. Peng Xu, Wei Ping, Xianchao Wu, Chejian Xu, Zihan Liu, Mohammad Shoeybi, Bryan Catanzaro.
NeurIPS 2025. Efficient Hybrid Language Model Compression Through Group-Aware SSM Pruning. Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan, Marcin Chochowski, Yashaswi Karnati, Raviraj Bhuminand Joshi, Ameya Sunil Mahabaleshwarkar, Zijia Chen, Yoshi Suhara, Oluwatobi Olabiyi, Daniel Korzekwa, Mostofa Patwary, Mohammad Shoeybi, Jan Kautz, Bryan Catanzaro, Ashwath Aithal, Nima Tajbakhsh, Pavlo Molchanov.
ICLR 2025. MIND: Math Informed syNthetic Dialogues for Pretraining LLMs. Syeda Nahida Akter, Shrimai Prabhumoye, John Kamalu, Sanjeev Satheesh, Eric Nyberg, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro.
ICLR 2025. MM-Embed: Universal Multimodal Retrieval with Multimodal LLMs. Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi, Jimmy Lin, Bryan Catanzaro, Wei Ping.
ICLR 2025. NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models. Chankyu Lee, Rajarshi Roy, Mengyao Xu, Jonathan Raiman, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping.
NeurIPS 2025. Prismatic Synthesis: Gradient-Based Data Diversification Boosts Generalization in LLM Reasoning. Jaehun Jung, Seungju Han, Ximing Lu, Skyler Hallinan, David Acuna, Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro, Yejin Choi.
NeurIPS 2024. ChatQA: Surpassing GPT-4 on Conversational QA and RAG. Zihan Liu, Wei Ping, Rajarshi Roy, Peng Xu, Chankyu Lee, Mohammad Shoeybi, Bryan Catanzaro.
NeurIPS 2024. Compact Language Models via Pruning and Knowledge Distillation. Saurav Muralidharan, Sharath Turuvekere Sreenivas, Raviraj Joshi, Marcin Chochowski, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro, Jan Kautz, Pavlo Molchanov.
ICML 2024. InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining. Boxin Wang, Wei Ping, Lawrence McAfee, Peng Xu, Bo Li, Mohammad Shoeybi, Bryan Catanzaro.
ICML 2024. ODIN: Disentangled Reward Mitigates Hacking in RLHF. Lichang Chen, Chen Zhu, Jiuhai Chen, Davit Soselia, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro.
NeurIPS 2024. RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs. Yue Yu, Wei Ping, Zihan Liu, Boxin Wang, Jiaxuan You, Chao Zhang, Mohammad Shoeybi, Bryan Catanzaro.
ICLR 2024. Retrieval Meets Long Context Large Language Models. Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, Bryan Catanzaro.
CVPR 2024. VILA: On Pre-Training for Visual Language Models. Ji Lin, Hongxu Yin, Wei Ping, Pavlo Molchanov, Mohammad Shoeybi, Song Han.
NeurIPS 2022. Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models. Boxin Wang, Wei Ping, Chaowei Xiao, Peng Xu, Mostofa Patwary, Mohammad Shoeybi, Bo Li, Anima Anandkumar, Bryan Catanzaro.
NeurIPS 2022. Factuality Enhanced Language Models for Open-Ended Text Generation. Nayeon Lee, Wei Ping, Peng Xu, Mostofa Patwary, Pascale N Fung, Mohammad Shoeybi, Bryan Catanzaro.
NeurIPS 2021. Long-Short Transformer: Efficient Transformers for Language and Vision. Chen Zhu, Wei Ping, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro.
ICML 2017. Deep Voice: Real-Time Neural Text-to-Speech. Sercan Ö. Arık, Mike Chrzanowski, Adam Coates, Gregory Diamos, Andrew Gibiansky, Yongguo Kang, Xian Li, John Miller, Andrew Ng, Jonathan Raiman, Shubho Sengupta, Mohammad Shoeybi.