Catanzaro, Bryan

65 publications

ICLR 2026 AceReason-Nemotron 1.1: Advancing Math and Code Reasoning Through SFT and RL Synergy Zihan Liu, Zhuolin Yang, Yang Chen, Chankyu Lee, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping
ICLR 2026 Front-Loading Reasoning: The Synergy Between Pretraining and Post-Training Data Syeda Nahida Akter, Shrimai Prabhumoye, Eric Nyberg, Mostofa Patwary, Mohammad Shoeybi, Yejin Choi, Bryan Catanzaro
ICLR 2026 Music Flamingo: Scaling Music Understanding in Audio Language Models Sreyan Ghosh, Arushi Goel, Lasha Koroshinadze, Sang-gil Lee, Zhifeng Kong, Joao Felipe Santos, Ramani Duraiswami, Dinesh Manocha, Wei Ping, Mohammad Shoeybi, Bryan Catanzaro
ICLR 2026 Nemotron-CC-Math: A 133 Billion-Token-Scale High Quality Math Pretraining Dataset Rabeeh Karimi Mahabadi, Sanjeev Satheesh, Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
ICLR 2026 Nemotron-Research-Tool-N1: Exploring Tool-Using Language Models with Reinforced Reasoning Shaokun Zhang, Yi Dong, Jieyu Zhang, Jan Kautz, Bryan Catanzaro, Andrew Tao, Qingyun Wu, Zhiding Yu, Guilin Liu
ICLR 2026 OmniVinci: Enhancing Architecture and Data for Omni-Modal Understanding LLM Hanrong Ye, Chao-Han Huck Yang, Arushi Goel, Wei Huang, Zhen Wan, Jinchuan Tian, An-Chieh Cheng, Ligeng Zhu, Yuanhang Su, Yuming Lou, Yong-Xiang Lin, Dong Yang, Sreyan Ghosh, Zhijian Liu, Yukang Chen, Ehsan Jahangiri, Ambrish Dantrey, Daguang Xu, Ehsan Hosseini-Asl, Seyed Danial Mohseni Taheri, Vidya Nariyambut Murali, Sifei Liu, Yao Lu, Oluwatobi Olabiyi, Yu-Chiang Frank Wang, Rafael Valle, Bryan Catanzaro, Andrew Tao, Song Han, Jan Kautz, Hongxu Yin, Pavlo Molchanov
ICLR 2026 RLP: Reinforcement as a Pretraining Objective Ali Hatamizadeh, Syeda Nahida Akter, Shrimai Prabhumoye, Jan Kautz, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro, Yejin Choi
ICLR 2026 TangoFlux: Super Fast and Faithful Text to Audio Generation with Flow Matching and Clap-Ranked Preference Optimization Chia-Yu Hung, Navonil Majumder, Zhifeng Kong, Ambuj Mehrish, Amir Zadeh, Chuan Li, Rafael Valle, Bryan Catanzaro, Soujanya Poria
ICLR 2026 UALM: Unified Audio Language Model for Understanding, Generation and Reasoning Jinchuan Tian, Sang-gil Lee, Zhifeng Kong, Sreyan Ghosh, Arushi Goel, Chao-Han Huck Yang, Wenliang Dai, Zihan Liu, Hanrong Ye, Shinji Watanabe, Mohammad Shoeybi, Bryan Catanzaro, Rafael Valle, Wei Ping
NeurIPS 2025 AceReason-Nemotron: Advancing Math and Code Reasoning Through Reinforcement Learning Yang Chen, Zhuolin Yang, Zihan Liu, Chankyu Lee, Peng Xu, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping
ICML 2025 Audio Flamingo 2: An Audio-Language Model with Long-Audio Understanding and Expert Reasoning Abilities Sreyan Ghosh, Zhifeng Kong, Sonal Kumar, S Sakshi, Jaehyeon Kim, Wei Ping, Rafael Valle, Dinesh Manocha, Bryan Catanzaro
NeurIPS 2025 Audio Flamingo 3: Advancing Audio Intelligence with Fully Open Large Audio Language Models Sreyan Ghosh, Arushi Goel, Jaehyeon Kim, Sonal Kumar, Zhifeng Kong, Sang-gil Lee, Chao-Han Huck Yang, Ramani Duraiswami, Dinesh Manocha, Rafael Valle, Bryan Catanzaro
ICLR 2025 ChatQA 2: Bridging the Gap to Proprietary LLMs in Long Context and RAG Capabilities Peng Xu, Wei Ping, Xianchao Wu, Chejian Xu, Zihan Liu, Mohammad Shoeybi, Bryan Catanzaro
ICML 2025 ETTA: Elucidating the Design Space of Text-to-Audio Models Sang-Gil Lee, Zhifeng Kong, Arushi Goel, Sungwon Kim, Rafael Valle, Bryan Catanzaro
NeurIPS 2025 Eagle 2.5: Boosting Long-Context Post-Training for Frontier Vision-Language Models Guo Chen, Zhiqi Li, Shihao Wang, Jindong Jiang, Yicheng Liu, Lidong Lu, De-An Huang, Wonmin Byeon, Matthieu Le, Max Ehrlich, Tong Lu, Limin Wang, Bryan Catanzaro, Jan Kautz, Andrew Tao, Zhiding Yu, Guilin Liu
ICLR 2025 Eagle: Exploring the Design Space for Multimodal LLMs with Mixture of Encoders Min Shi, Fuxiao Liu, Shihao Wang, Shijia Liao, Subhashree Radhakrishnan, Yilin Zhao, De-An Huang, Hongxu Yin, Karan Sapra, Yaser Yacoob, Humphrey Shi, Bryan Catanzaro, Andrew Tao, Jan Kautz, Zhiding Yu, Guilin Liu
NeurIPS 2025 Efficient Hybrid Language Model Compression Through Group-Aware SSM Pruning Ali Taghibakhshi, Sharath Turuvekere Sreenivas, Saurav Muralidharan, Marcin Chochowski, Yashaswi Karnati, Raviraj Bhuminand Joshi, Ameya Sunil Mahabaleshwarkar, Zijia Chen, Yoshi Suhara, Oluwatobi Olabiyi, Daniel Korzekwa, Mostofa Patwary, Mohammad Shoeybi, Jan Kautz, Bryan Catanzaro, Ashwath Aithal, Nima Tajbakhsh, Pavlo Molchanov
ICML 2025 FeatSharp: Your Vision Model Features, Sharper Mike Ranzinger, Greg Heinrich, Pavlo Molchanov, Bryan Catanzaro, Andrew Tao
ICLR 2025 Fugatto 1: Foundational Generative Audio Transformer Opus 1 Rafael Valle, Rohan Badlani, Zhifeng Kong, Sang-gil Lee, Arushi Goel, Sungwon Kim, Joao Felipe Santos, Shuqi Dai, Siddharth Gururani, Aya Aljafari, Alexander H. Liu, Kevin J. Shih, Ryan Prenger, Wei Ping, Chao-Han Huck Yang, Bryan Catanzaro
ICLR 2025 MIND: Math Informed syNthetic Dialogues for Pretraining LLMs Syeda Nahida Akter, Shrimai Prabhumoye, John Kamalu, Sanjeev Satheesh, Eric Nyberg, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro
ICLR 2025 Mm-Embed: Universal Multimodal Retrieval with Multimodal LLMs Sheng-Chieh Lin, Chankyu Lee, Mohammad Shoeybi, Jimmy Lin, Bryan Catanzaro, Wei Ping
ICLR 2025 NV-Embed: Improved Techniques for Training LLMs as Generalist Embedding Models Chankyu Lee, Rajarshi Roy, Mengyao Xu, Jonathan Raiman, Mohammad Shoeybi, Bryan Catanzaro, Wei Ping
ICML 2025 Nemotron-CORTEXA: Enhancing LLM Agents for Software Engineering Tasks via Improved Localization and Solution Diversity Atefeh Sohrabizadeh, Jialin Song, Mingjie Liu, Rajarshi Roy, Chankyu Lee, Jonathan Raiman, Bryan Catanzaro
NeurIPS 2025 Prismatic Synthesis: Gradient-Based Data Diversification Boosts Generalization in LLM Reasoning Jaehun Jung, Seungju Han, Ximing Lu, Skyler Hallinan, David Acuna, Shrimai Prabhumoye, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro, Yejin Choi
CVPR 2025 RADIOv2.5: Improved Baselines for Agglomerative Vision Foundation Models Greg Heinrich, Mike Ranzinger, Hongxu Yin, Yao Lu, Jan Kautz, Andrew Tao, Bryan Catanzaro, Pavlo Molchanov
ICLR 2025 Synthio: Augmenting Small-Scale Audio Classification Datasets with Synthetic Data Sreyan Ghosh, Sonal Kumar, Zhifeng Kong, Rafael Valle, Bryan Catanzaro, Dinesh Manocha
ICLR 2025 UniWav: Towards Unified Pre-Training for Speech Representation Learning and Generation Alexander H. Liu, Sang-gil Lee, Chao-Han Huck Yang, Yuan Gong, Yu-Chiang Frank Wang, James R. Glass, Rafael Valle, Bryan Catanzaro
ICML 2024 Audio Flamingo: A Novel Audio Language Model with Few-Shot Learning and Dialogue Abilities Zhifeng Kong, Arushi Goel, Rohan Badlani, Wei Ping, Rafael Valle, Bryan Catanzaro
NeurIPS 2024 ChatQA: Surpassing GPT-4 on Conversational QA and RAG Zihan Liu, Wei Ping, Rajarshi Roy, Peng Xu, Chankyu Lee, Mohammad Shoeybi, Bryan Catanzaro
NeurIPS 2024 Compact Language Models via Pruning and Knowledge Distillation Saurav Muralidharan, Sharath Turuvekere Sreenivas, Raviraj Joshi, Marcin Chochowski, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro, Jan Kautz, Pavlo Molchanov
ICML 2024 InstructRetro: Instruction Tuning Post Retrieval-Augmented Pretraining Boxin Wang, Wei Ping, Lawrence Mcafee, Peng Xu, Bo Li, Mohammad Shoeybi, Bryan Catanzaro
WACV 2024 Leveraging Bitstream Metadata for Fast, Accurate, Generalized Compressed Video Quality Enhancement Max Ehrlich, Jon Barker, Namitha Padmanabhan, Larry Davis, Andrew Tao, Bryan Catanzaro, Abhinav Shrivastava
ICML 2024 ODIN: Disentangled Reward Mitigates Hacking in RLHF Lichang Chen, Chen Zhu, Jiuhai Chen, Davit Soselia, Tianyi Zhou, Tom Goldstein, Heng Huang, Mohammad Shoeybi, Bryan Catanzaro
NeurIPS 2024 RankRAG: Unifying Context Ranking with Retrieval-Augmented Generation in LLMs Yue Yu, Wei Ping, Zihan Liu, Boxin Wang, Jiaxuan You, Chao Zhang, Mohammad Shoeybi, Bryan Catanzaro
ICLR 2024 Retrieval Meets Long Context Large Language Models Peng Xu, Wei Ping, Xianchao Wu, Lawrence McAfee, Chen Zhu, Zihan Liu, Sandeep Subramanian, Evelina Bakhturina, Mohammad Shoeybi, Bryan Catanzaro
ICLR 2023 BigVGAN: A Universal Neural Vocoder with Large-Scale Training Sang-gil Lee, Wei Ping, Boris Ginsburg, Bryan Catanzaro, Sungroh Yoon
NeurIPSW 2023 CircuitVAE: Efficient and Scalable Latent Circuit Optimization Jialin Song, Aidan Swope, Robert Kirby, Rajarshi Roy, Saad Godil, Jonathan Raiman, Bryan Catanzaro
NeurIPS 2023 P-Flow: A Fast and Data-Efficient Zero-Shot TTS Through Speech Prompting Sungwon Kim, Kevin Shih, Rohan Badlani, Joao Felipe Santos, Evelina Bakhturina, Mikyas Desta, Rafael Valle, Sungroh Yoon, Bryan Catanzaro
ICCV 2023 Preserve Your Own Correlation: A Noise Prior for Video Diffusion Models Songwei Ge, Seungjun Nah, Guilin Liu, Tyler Poon, Andrew Tao, Bryan Catanzaro, David Jacobs, Jia-Bin Huang, Ming-Yu Liu, Yogesh Balaji
ICLR 2022 Efficient Token Mixing for Transformers via Adaptive Fourier Neural Operators John Guibas, Morteza Mardani, Zongyi Li, Andrew Tao, Anima Anandkumar, Bryan Catanzaro
NeurIPS 2022 Exploring the Limits of Domain-Adaptive Training for Detoxifying Large-Scale Language Models Boxin Wang, Wei Ping, Chaowei Xiao, Peng Xu, Mostofa Patwary, Mohammad Shoeybi, Bo Li, Anima Anandkumar, Bryan Catanzaro
NeurIPS 2022 Factuality Enhanced Language Models for Open-Ended Text Generation Nayeon Lee, Wei Ping, Peng Xu, Mostofa Patwary, Pascale N Fung, Mohammad Shoeybi, Bryan Catanzaro
ICLR 2021 DiffWave: A Versatile Diffusion Model for Audio Synthesis Zhifeng Kong, Wei Ping, Jiaji Huang, Kexin Zhao, Bryan Catanzaro
ICCV 2021 Dual Contrastive Loss and Attention for GANs Ning Yu, Guilin Liu, Aysegul Dundar, Andrew Tao, Bryan Catanzaro, Larry S. Davis, Mario Fritz
ICLR 2021 Flowtron: An Autoregressive Flow-Based Generative Network for Text-to-Speech Synthesis Rafael Valle, Kevin J. Shih, Ryan Prenger, Bryan Catanzaro
NeurIPS 2021 Long-Short Transformer: Efficient Transformers for Language and Vision Chen Zhu, Wei Ping, Chaowei Xiao, Mohammad Shoeybi, Tom Goldstein, Anima Anandkumar, Bryan Catanzaro
ICMLW 2021 RAD-TTS: Parallel Flow-Based TTS with Robust Alignment Learning and Diverse Synthesis Kevin J. Shih, Rafael Valle, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro
CVPR 2021 View Generalization for Single Image Textured 3D Models Anand Bhattad, Aysegul Dundar, Guilin Liu, Andrew Tao, Bryan Catanzaro
NeurIPS 2020 Can Q-Learning with Graph Networks Learn a Generalizable Branching Heuristic for a SAT Solver? Vitaly Kurin, Saad Godil, Shimon Whiteson, Bryan Catanzaro
NeurIPS 2020 Neural FFTs for Universal Texture Image Synthesis Morteza Mardani, Guilin Liu, Aysegul Dundar, Shiqiu Liu, Andrew Tao, Bryan Catanzaro
CVPR 2020 Panoptic-Based Image Synthesis Aysegul Dundar, Karan Sapra, Guilin Liu, Andrew Tao, Bryan Catanzaro
NeurIPS 2019 Few-Shot Video-to-Video Synthesis Ting-Chun Wang, Ming-Yu Liu, Andrew Tao, Guilin Liu, Bryan Catanzaro, Jan Kautz
CVPR 2019 Graphical Contrastive Losses for Scene Graph Parsing Ji Zhang, Kevin J. Shih, Ahmed Elgammal, Andrew Tao, Bryan Catanzaro
CVPR 2019 Improving Semantic Segmentation via Video Propagation and Label Relaxation Yi Zhu, Karan Sapra, Fitsum A. Reda, Kevin J. Shih, Shawn Newsam, Andrew Tao, Bryan Catanzaro
ICCV 2019 Unsupervised Video Interpolation Using Cycle Consistency Fitsum A. Reda, Deqing Sun, Aysegul Dundar, Mohammad Shoeybi, Guilin Liu, Kevin J. Shih, Andrew Tao, Jan Kautz, Bryan Catanzaro
CVPR 2018 High-Resolution Image Synthesis and Semantic Manipulation with Conditional GANs Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Andrew Tao, Jan Kautz, Bryan Catanzaro
ECCV 2018 Image Inpainting for Irregular Holes Using Partial Convolutions Guilin Liu, Fitsum A. Reda, Kevin J. Shih, Ting-Chun Wang, Andrew Tao, Bryan Catanzaro
ECCV 2018 SDC-Net: Video Prediction Using Spatially-Displaced Convolution Fitsum A. Reda, Guilin Liu, Kevin J. Shih, Robert Kirby, Jon Barker, David Tarjan, Andrew Tao, Bryan Catanzaro
NeurIPS 2018 Video-to-Video Synthesis Ting-Chun Wang, Ming-Yu Liu, Jun-Yan Zhu, Guilin Liu, Andrew Tao, Jan Kautz, Bryan Catanzaro
ICLR 2017 DSD: Dense-Sparse-Dense Training for Deep Neural Networks Song Han, Jeff Pool, Sharan Narang, Huizi Mao, Enhao Gong, Shijian Tang, Erich Elsen, Peter Vajda, Manohar Paluri, John Tran, Bryan Catanzaro, William J. Dally
ICML 2016 Deep Speech 2 : End-to-End Speech Recognition in English and Mandarin Dario Amodei, Sundaram Ananthanarayanan, Rishita Anubhai, Jingliang Bai, Eric Battenberg, Carl Case, Jared Casper, Bryan Catanzaro, Qiang Cheng, Guoliang Chen, Jie Chen, Jingdong Chen, Zhijie Chen, Mike Chrzanowski, Adam Coates, Greg Diamos, Ke Ding, Niandong Du, Erich Elsen, Jesse Engel, Weiwei Fang, Linxi Fan, Christopher Fougner, Liang Gao, Caixia Gong, Awni Hannun, Tony Han, Lappi Johannes, Bing Jiang, Cai Ju, Billy Jun, Patrick LeGresley, Libby Lin, Junjie Liu, Yang Liu, Weigao Li, Xiangang Li, Dongpeng Ma, Sharan Narang, Andrew Ng, Sherjil Ozair, Yiping Peng, Ryan Prenger, Sheng Qian, Zongfeng Quan, Jonathan Raiman, Vinay Rao, Sanjeev Satheesh, David Seetapun, Shubho Sengupta, Kavya Srinet, Anuroop Sriram, Haiyuan Tang, Liliang Tang, Chong Wang, Jidong Wang, Kaifu Wang, Yi Wang, Zhijian Wang, Zhiqian Wang, Shuang Wu, Likai Wei, Bo Xiao, Wen Xie, Yan Xie, Dani Yogatama, Bin Yuan, Jun Zhan, Zhenyao Zhu
ICML 2016 Persistent RNNs: Stashing Recurrent Weights On-Chip Greg Diamos, Shubho Sengupta, Bryan Catanzaro, Mike Chrzanowski, Adam Coates, Erich Elsen, Jesse Engel, Awni Hannun, Sanjeev Satheesh
ICML 2013 Deep Learning with COTS HPC Systems Adam Coates, Brody Huval, Tao Wang, David Wu, Bryan Catanzaro, Ng Andrew
ICCV 2009 Efficient, High-Quality Image Contour Detection Bryan Catanzaro, Bor-Yiing Su, Narayanan Sundaram, Yunsup Lee, Mark Murphy, Kurt Keutzer
ICML 2008 Fast Support Vector Machine Training and Classification on Graphics Processors Bryan Catanzaro, Narayanan Sundaram, Kurt Keutzer