Shankar, Vaishaal

29 publications

NeurIPS 2025 Datasets, Documents, and Repetitions: The Practicalities of Unequal Data Quality Alex Fang, Hadi Pouransari, Matt Jordan, Alexander T Toshev, Vaishaal Shankar, Ludwig Schmidt, Tom Gunter
ICLR 2025 Language Models Scale Reliably with Over-Training and on Downstream Tasks Samir Yitzhak Gadre, Georgios Smyrnis, Vaishaal Shankar, Suchin Gururangan, Mitchell Wortsman, Rulin Shao, Jean Mercat, Alex Fang, Jeffrey Li, Sedrick Keh, Rui Xin, Marianna Nezhurina, Igor Vasiljevic, Luca Soldaini, Jenia Jitsev, Alex Dimakis, Gabriel Ilharco, Pang Wei Koh, Shuran Song, Thomas Kollar, Yair Carmon, Achal Dave, Reinhard Heckel, Niklas Muennighoff, Ludwig Schmidt
TMLR 2025 MobileCLIP2: Improving Multi-Modal Reinforced Training Fartash Faghri, Pavan Kumar Anasosalu Vasu, Cem Koc, Vaishaal Shankar, Alexander T Toshev, Oncel Tuzel, Hadi Pouransari
ICLR 2024 Data Filtering Networks Alex Fang, Albin Madappally Jose, Amit Jain, Ludwig Schmidt, Alexander T Toshev, Vaishaal Shankar
NeurIPS 2024 DataComp-LM: In Search of the Next Generation of Training Sets for Language Models Jeffrey Li, Alex Fang, Georgios Smyrnis, Maor Ivgi, Matt Jordan, Samir Gadre, Hritik Bansal, Etash Guha, Sedrick Keh, Kushal Arora, Saurabh Garg, Rui Xin, Niklas Muennighoff, Reinhard Heckel, Jean Mercat, Mayee Chen, Suchin Gururangan, Mitchell Wortsman, Alon Albalak, Yonatan Bitton, Marianna Nezhurina, Amro Abbas, Cheng-Yu Hsieh, Dhruba Ghosh, Josh Gardner, Maciej Kilian, Hanlin Zhang, Rulin Shao, Sarah Pratt, Sunny Sanyal, Gabriel Ilharco, Giannis Daras, Kalyani Marathe, Aaron Gokaslan, Jieyu Zhang, Khyathi Chandu, Thao Nguyen, Igor Vasiljevic, Sham Kakade, Shuran Song, Sujay Sanghavi, Fartash Faghri, Sewoong Oh, Luke Zettlemoyer, Kyle Lo, Alaaeldin El-Nouby, Hadi Pouransari, Alexander Toshev, Stephanie Wang, Dirk Groeneveld, Luca Soldaini, Pang Wei Koh, Jenia Jitsev, Thomas Kollar, Alexandros G. Dimakis, Yair Carmon, Achal Dave, Ludwig Schmidt, Vaishaal Shankar
NeurIPS 2024 Dataset Decomposition: Faster LLM Training with Variable Sequence Length Curriculum Hadi Pouransari, Chun-Liang Li, Jen-Hao Rick Chang, Pavan Kumar Anasosalu Vasu, Cem Koc, Vaishaal Shankar, Oncel Tuzel
TMLR 2024 Interpreting CLIP: Insights on the Robustness to ImageNet Distribution Shifts Jonathan Crabbé, Pau Rodriguez, Vaishaal Shankar, Luca Zappella, Arno Blaas
ICML 2024 Scalable Pre-Training of Large Autoregressive Image Models Alaaeldin El-Nouby, Michal Klein, Shuangfei Zhai, Miguel Ángel Bautista, Vaishaal Shankar, Alexander T Toshev, Joshua M. Susskind, Armand Joulin
ICLR 2024 TiC-CLIP: Continual Training of CLIP Models Saurabh Garg, Mehrdad Farajtabar, Hadi Pouransari, Raviteja Vemulapalli, Sachin Mehta, Oncel Tuzel, Vaishaal Shankar, Fartash Faghri
NeurIPSW 2024 TiC-LM: A Multi-Year Benchmark for Continual Pretraining of Language Models Jeffrey Li, Mohammadreza Armandpour, Seyed Iman Mirzadeh, Sachin Mehta, Vaishaal Shankar, Raviteja Vemulapalli, Oncel Tuzel, Mehrdad Farajtabar, Hadi Pouransari, Fartash Faghri
NeurIPSW 2023 Data Filtering Networks Alex Fang, Albin Madappally Jose, Amit Jain, Ludwig Schmidt, Alexander T Toshev, Vaishaal Shankar
NeurIPS 2023 DataComp: In Search of the Next Generation of Multimodal Datasets Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei W Koh, Olga Saukh, Alexander J Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, Ludwig Schmidt
CVPR 2023 Masked Autoencoding Does Not Help Natural Language Supervision at Scale Floris Weers, Vaishaal Shankar, Angelos Katharopoulos, Yinfei Yang, Tom Gunter
NeurIPSW 2023 Pre-Trained Language Models Do Not Help Auto-Regressive Text-to-Image Generation Yuhui Zhang, Brandon McKinzie, Zhe Gan, Vaishaal Shankar, Alexander Toshev
ICML 2023 Robustness in Multimodal Learning Under Train-Test Modality Mismatch Brandon Mckinzie, Vaishaal Shankar, Joseph Yitan Cheng, Yinfei Yang, Jonathon Shlens, Alexander T Toshev
NeurIPSW 2023 TiC-CLIP: Continual Training of CLIP Models Saurabh Garg, Mehrdad Farajtabar, Hadi Pouransari, Raviteja Vemulapalli, Sachin Mehta, Oncel Tuzel, Vaishaal Shankar, Fartash Faghri
ICML 2022 Data Determines Distributional Robustness in Contrastive Language Image Pre-Training (CLIP) Alex Fang, Gabriel Ilharco, Mitchell Wortsman, Yuhao Wan, Vaishaal Shankar, Achal Dave, Ludwig Schmidt
ICML 2021 Accuracy on the Line: On the Strong Correlation Between Out-of-Distribution and In-Distribution Generalization John P Miller, Rohan Taori, Aditi Raghunathan, Shiori Sagawa, Pang Wei Koh, Vaishaal Shankar, Percy Liang, Yair Carmon, Ludwig Schmidt
ICCV 2021 Do Image Classifiers Generalize Across Time? Vaishaal Shankar, Achal Dave, Rebecca Roelofs, Deva Ramanan, Benjamin Recht, Ludwig Schmidt
NeurIPSW 2021 Do ImageNet Classifiers Generalize to ImageNet? Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, Vaishaal Shankar
NeurIPSW 2021 Evaluating Machine Accuracy on ImageNet Vaishaal Shankar, Rebecca Roelofs, Horia Mania, Alex Fang, Benjamin Recht, Ludwig Schmidt
NeurIPSW 2021 Measuring Robustness to Natural Distribution Shifts in Image Classification Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt
ICCV 2021 Predicting with Confidence on Unseen Distributions Devin Guillory, Vaishaal Shankar, Sayna Ebrahimi, Trevor Darrell, Ludwig Schmidt
ICML 2020 Evaluating Machine Accuracy on ImageNet Vaishaal Shankar, Rebecca Roelofs, Horia Mania, Alex Fang, Benjamin Recht, Ludwig Schmidt
NeurIPS 2020 Measuring Robustness to Natural Distribution Shifts in Image Classification Rohan Taori, Achal Dave, Vaishaal Shankar, Nicholas Carlini, Benjamin Recht, Ludwig Schmidt
ICML 2020 Neural Kernels Without Tangents Vaishaal Shankar, Alex Fang, Wenshuo Guo, Sara Fridovich-Keil, Jonathan Ragan-Kelley, Ludwig Schmidt, Benjamin Recht
NeurIPS 2019 A Meta-Analysis of Overfitting in Machine Learning Rebecca Roelofs, Vaishaal Shankar, Benjamin Recht, Sara Fridovich-Keil, Moritz Hardt, John Miller, Ludwig Schmidt
ICMLW 2019 A Systematic Framework for Natural Perturbations from Videos Vaishaal Shankar, Achal Dave, Rebecca Roelofs, Deva Ramanan, Benjamin Recht, Ludwig Schmidt
ICML 2019 Do ImageNet Classifiers Generalize to ImageNet? Benjamin Recht, Rebecca Roelofs, Ludwig Schmidt, Vaishaal Shankar