Gupta, Tanmay

16 publications

CVPR 2025. Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models. Matt Deitke, Christopher Clark, Sangho Lee, Rohun Tripathi, Yue Yang, Jae Sung Park, Mohammadreza Salehi, Niklas Muennighoff, Kyle Lo, Luca Soldaini, Jiasen Lu, Taira Anderson, Erin Bransom, Kiana Ehsani, Huong Ngo, YenSung Chen, Ajay Patel, Mark Yatskar, Chris Callison-Burch, Andrew Head, Rose Hendrix, Favyen Bastani, Eli VanderBilt, Nathan Lambert, Yvonne Chou, Arnavi Chheda, Jenna Sparks, Sam Skjonsberg, Michael Schmitz, Aaron Sarnat, Byron Bischoff, Pete Walsh, Chris Newell, Piper Wolters, Tanmay Gupta, Kuo-Hao Zeng, Jon Borchardt, Dirk Groeneveld, Crystal Nam, Sophie Lebrecht, Caitlin Wittlif, Carissa Schoenick, Oscar Michel, Ranjay Krishna, Luca Weihs, Noah A. Smith, Hannaneh Hajishirzi, Ross Girshick, Ali Farhadi, Aniruddha Kembhavi
ECCV 2024. M&m’s: A Benchmark to Evaluate Tool-Use for Multi-Step Multi-Modal Tasks. Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna
CVPR 2024. SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World. Kiana Ehsani, Tanmay Gupta, Rose Hendrix, Jordi Salvador, Luca Weihs, Kuo-Hao Zeng, Kunal Pratap Singh, Yejin Kim, Winson Han, Alvaro Herrasti, Ranjay Krishna, Dustin Schwenk, Eli VanderBilt, Aniruddha Kembhavi
NeurIPS 2024. Task Me Anything. Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna
NeurIPSW 2024. Taskverse: A Benchmark Generation Engine for Multi-Modal Language Model. Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna
NeurIPS 2023. OBJECT 3DIT: Language-Guided 3D-Aware Image Editing. Oscar Michel, Anand Bhattad, Eli VanderBilt, Ranjay Krishna, Aniruddha Kembhavi, Tanmay Gupta
CVPR 2023. Visual Programming: Compositional Visual Reasoning Without Training. Tanmay Gupta, Aniruddha Kembhavi
ICMLW 2022. Conditional Distributional Invariance Through Implicit Regularization. Tanmay Gupta
CVPR 2022. Towards General Purpose Vision Systems: An End-to-End Task-Agnostic Vision-Language Architecture. Tanmay Gupta, Amita Kamath, Aniruddha Kembhavi, Derek Hoiem
ECCV 2022. Webly Supervised Concept Expansion for General Purpose Vision Models. Amita Kamath, Christopher Clark, Tanmay Gupta, Eric Kolve, Derek Hoiem, Aniruddha Kembhavi
ICML 2021. Learning Curves for Analysis of Deep Networks. Derek Hoiem, Tanmay Gupta, Zhizhong Li, Michal Shlapentokh-Rothman
CVPR 2021. Visual Semantic Role Labeling for Video Understanding. Arka Sadhu, Tanmay Gupta, Mark Yatskar, Ram Nevatia, Aniruddha Kembhavi
ECCV 2020. Contrastive Learning for Weakly Supervised Phrase Grounding. Tanmay Gupta, Arash Vahdat, Gal Chechik, Xiaodong Yang, Jan Kautz, Derek Hoiem
ECCV 2018. Imagine This! Scripts to Compositions to Videos. Tanmay Gupta, Dustin Schwenk, Ali Farhadi, Derek Hoiem, Aniruddha Kembhavi
ICCV 2017. Aligned Image-Word Representations Improve Inductive Transfer Across Vision-Language Tasks. Tanmay Gupta, Kevin Shih, Saurabh Singh, Derek Hoiem
CVPR 2015. Completing 3D Object Shape from One Depth Image. Jason Rock, Tanmay Gupta, Justin Thorsen, JunYoung Gwak, Daeyun Shin, Derek Hoiem