Krishna, Ranjay

76 publications

ICLR 2025 AHA: A Vision-Language-Model for Detecting and Reasoning over Failures in Robotic Manipulation Jiafei Duan, Wilbert Pumacay, Nishanth Kumar, Yi Ru Wang, Shulin Tian, Wentao Yuan, Ranjay Krishna, Dieter Fox, Ajay Mandlekar, Yijie Guo
CVPR 2025 Coarse Correspondences Boost Spatial-Temporal Reasoning in Multimodal Language Model Benlin Liu, Yuhao Dong, Yiqin Wang, Zixian Ma, Yansong Tang, Luming Tang, Yongming Rao, Wei-Chiu Ma, Ranjay Krishna
ICCV 2025 Contrastive Flow Matching George Stoica, Vivek Ramanujan, Xiang Fan, Ali Farhadi, Ranjay Krishna, Judy Hoffman
NeurIPS 2025 Convergent Functions, Divergent Forms Hyeonseong Jeon, Ainaz Eftekhar, Aaron Walsman, Kuo-Hao Zeng, Ali Farhadi, Ranjay Krishna
CVPR 2025 Eval3D: Interpretable and Fine-Grained Evaluation for 3D Generation Shivam Duggal, Yushi Hu, Oscar Michel, Aniruddha Kembhavi, William T. Freeman, Noah A. Smith, Ranjay Krishna, Antonio Torralba, Ali Farhadi, Wei-Chiu Ma
ICLRW 2025 FocalLens: Instruction Tuning Enables Zero-Shot Conditional Image Representations Cheng-Yu Hsieh, Pavan Kumar Anasosalu Vasu, Fartash Faghri, Raviteja Vemulapalli, Chun-Liang Li, Ranjay Krishna, Oncel Tuzel, Hadi Pouransari
CoRL 2025 GraspMolmo: Generalizable Task-Oriented Grasping via Large-Scale Synthetic Data Generation Abhay Deshpande, Yuquan Deng, Jordi Salvador, Arijit Ray, Winson Han, Jiafei Duan, Rose Hendrix, Yuke Zhu, Ranjay Krishna
ICLR 2025 Interleaved Scene Graphs for Interleaved Text-and-Image Generation Assessment Dongping Chen, Ruoxi Chen, Shu Pu, Zhaoyi Liu, Yanru Wu, Caixi Chen, Benlin Liu, Yue Huang, Yao Wan, Pan Zhou, Ranjay Krishna
ICLRW 2025 Language Model Preference Evaluation with Multiple Weak Evaluators Zhengyu Hu, Jieyu Zhang, Zhihan Xiong, Alexander Ratner, Hui Xiong, Ranjay Krishna
CoRL 2025 ManiFlow: A General Robot Manipulation Policy via Consistency Flow Training Ge Yan, Jiyue Zhu, Yuquan Deng, Shiqi Yang, Ri-Zhao Qiu, Xuxin Cheng, Marius Memmel, Ranjay Krishna, Ankit Goyal, Xiaolong Wang, Dieter Fox
CVPR 2025 Molmo and PixMo: Open Weights and Open Data for State-of-the-Art Vision-Language Models Matt Deitke, Christopher Clark, Sangho Lee, Rohun Tripathi, Yue Yang, Jae Sung Park, Mohammadreza Salehi, Niklas Muennighoff, Kyle Lo, Luca Soldaini, Jiasen Lu, Taira Anderson, Erin Bransom, Kiana Ehsani, Huong Ngo, YenSung Chen, Ajay Patel, Mark Yatskar, Chris Callison-Burch, Andrew Head, Rose Hendrix, Favyen Bastani, Eli VanderBilt, Nathan Lambert, Yvonne Chou, Arnavi Chheda, Jenna Sparks, Sam Skjonsberg, Michael Schmitz, Aaron Sarnat, Byron Bischoff, Pete Walsh, Chris Newell, Piper Wolters, Tanmay Gupta, Kuo-Hao Zeng, Jon Borchardt, Dirk Groeneveld, Crystal Nam, Sophie Lebrecht, Caitlin Wittlif, Carissa Schoenick, Oscar Michel, Ranjay Krishna, Luca Weihs, Noah A. Smith, Hannaneh Hajishirzi, Ross Girshick, Ali Farhadi, Aniruddha Kembhavi
CVPR 2025 NVILA: Efficient Frontier Visual Language Models Zhijian Liu, Ligeng Zhu, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Haotian Tang, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Jinyi Hu, Sifei Liu, Ranjay Krishna, Pavlo Molchanov, Jan Kautz, Hongxu Yin, Song Han, Yao Lu
CVPR 2025 One Diffusion to Generate Them All Duong H. Le, Tuan Pham, Sangho Lee, Christopher Clark, Aniruddha Kembhavi, Stephan Mandt, Ranjay Krishna, Jiasen Lu
ICCV 2025 One Trajectory, One Token: Grounded Video Tokenization via Panoptic Sub-Object Trajectory Chenhao Zheng, Jieyu Zhang, Mohammadreza Salehi, Ziqi Gao, Vishnu Iyengar, Norimasa Kobori, Quan Kong, Ranjay Krishna
ICCV 2025 PathFinder: A Multi-Modal Multi-Agent System for Medical Diagnostic Decision-Making Applied to Histopathology Fatemeh Ghezloo, Mehmet Saygin Seyfioglu, Rustin Soraki, Wisdom O. Ikezogwo, Beibin Li, Tejoram Vivekanandan, Joann G. Elmore, Ranjay Krishna, Linda Shapiro
CVPR 2025 Perception Tokens Enhance Visual Reasoning in Multimodal Language Models Mahtab Bigverdi, Zelun Luo, Cheng-Yu Hsieh, Ethan Shen, Dongping Chen, Linda G. Shapiro, Ranjay Krishna
CVPR 2025 RealEdit: Reddit Edits as a Large-Scale Empirical Dataset for Image Transformations Peter Sushko, Ayana Bharadwaj, Zhi Yang Lim, Vasily Ilin, Ben Caffee, Dongping Chen, Mohammadreza Salehi, Cheng-Yu Hsieh, Ranjay Krishna
ICML 2025 SAM2Act: Integrating Visual Foundation Model with a Memory Architecture for Robotic Manipulation Haoquan Fang, Markus Grotz, Wilbert Pumacay, Yi Ru Wang, Dieter Fox, Ranjay Krishna, Jiafei Duan
ICLRW 2025 SAM2Act: Integrating Visual Foundation Model with a Memory Architecture for Robotic Manipulation Haoquan Fang, Markus Grotz, Wilbert Pumacay, Yi Ru Wang, Dieter Fox, Ranjay Krishna, Jiafei Duan
NeurIPS 2025 Seeking and Updating with Live Visual Knowledge Mingyang Fu, Yuyang Peng, Dongping Chen, Zetong Zhou, Benlin Liu, Yao Wan, Zhou Zhao, Philip S. Yu, Ranjay Krishna
CVPR 2025 Semantic and Expressive Variations in Image Captions Across Languages Andre Ye, Sebastin Santy, Jena D. Hwang, Amy X. Zhang, Ranjay Krishna
CVPR 2025 Synthetic Visual Genome Jae Sung Park, Zixian Ma, Linjie Li, Chenhao Zheng, Cheng-Yu Hsieh, Ximing Lu, Khyathi Chandu, Quan Kong, Norimasa Kobori, Ali Farhadi, Yejin Choi, Ranjay Krishna
ICLRW 2025 TACO: Learning Multi-Modal Models to Reason and Act with Synthetic Chains-of-Thought-and-Action Zixian Ma, Jianguo Zhang, Zhiwei Liu, Jieyu Zhang, Juntao Tan, Manli Shu, Juan Carlos Niebles, Shelby Heinecke, Huan Wang, Caiming Xiong, Ranjay Krishna, Silvio Savarese
ICLRW 2025 The Delta Learning Hypothesis: Preference Tuning on Weak Data Can Yield Strong Gains Scott Geng, Hamish Ivison, Chun-Liang Li, Maarten Sap, Jerry Li, Ranjay Krishna, Pang Wei Koh
NeurIPS 2025 VAGEN: Reinforcing World Model Reasoning for Multi-Turn VLM Agents Kangrui Wang, Pingyue Zhang, Zihan Wang, Yaning Gao, Linjie Li, Qineng Wang, Hanyang Chen, Yiping Lu, Zhengyuan Yang, Lijuan Wang, Ranjay Krishna, Jiajun Wu, Li Fei-Fei, Yejin Choi, Manling Li
NeurIPS 2024 ActionAtlas: A VideoQA Benchmark for Domain-Specialized Action Recognition Mohammadreza Salehi, Jae Sung Park, Tanush Yadav, Aditya Kusupati, Ranjay Krishna, Yejin Choi, Hannaneh Hajishirzi, Ali Farhadi
ECCV 2024 BLINK: Multimodal Large Language Models Can See but Not Perceive Xingyu Fu, Yushi Hu, Bangzheng Li, Yu Feng, Haoyu Wang, Xudong Lin, Dan Roth, Noah A. Smith, Wei-Chiu Ma, Ranjay Krishna
ICLR 2024 Davidsonian Scene Graph: Improving Reliability in Fine-Grained Evaluation for Text-to-Image Generation Jaemin Cho, Yushi Hu, Jason Michael Baldridge, Roopal Garg, Peter Anderson, Ranjay Krishna, Mohit Bansal, Jordi Pont-Tuset, Su Wang
ICLRW 2024 EcoAssistant: Using LLM Assistants More Affordably and Accurately Jieyu Zhang, Ranjay Krishna, Ahmed Hassan Awadallah, Chi Wang
ECCV 2024 Efficient Inference of Vision Instruction-Following Models with Elastic Cache Zuyan Liu, Benlin Liu, Jiahui Wang, Yuhao Dong, Guangyi Chen, Yongming Rao, Ranjay Krishna, Jiwen Lu
CVPR 2024 Holodeck: Language Guided Generation of 3D Embodied AI Environments Yue Yang, Fan-Yun Sun, Luca Weihs, Eli VanderBilt, Alvaro Herrasti, Winson Han, Jiajun Wu, Nick Haber, Ranjay Krishna, Lingjie Liu, Chris Callison-Burch, Mark Yatskar, Aniruddha Kembhavi, Christopher Clark
CoRL 2024 I Can Tell What I Am Doing: Toward Real-World Natural Language Grounding of Robot Experiences Zihan Wang, Brian Liang, Varad Dhat, Zander Brumbaugh, Nick Walker, Ranjay Krishna, Maya Cakmak
CVPR 2024 Iterated Learning Improves Compositionality in Large Vision-Language Models Chenhao Zheng, Jieyu Zhang, Aniruddha Kembhavi, Ranjay Krishna
ECCV 2024 M&m’s: A Benchmark to Evaluate Tool-Use for Multi-Step Multi-Modal Tasks Zixian Ma, Weikai Huang, Jieyu Zhang, Tanmay Gupta, Ranjay Krishna
CVPRW 2024 MIMIC: Masked Image Modeling with Image Correspondences Kalyani Marathe, Mahtab Bigverdi, Nishat Khan, Tuhin Kundu, Patrick Howe, Sharan Ranjit S, Anand Bhattad, Aniruddha Kembhavi, Linda G. Shapiro, Ranjay Krishna
CoRL 2024 Manipulate-Anything: Automating Real-World Robots Using Vision-Language Models Jiafei Duan, Wentao Yuan, Wilbert Pumacay, Yi Ru Wang, Kiana Ehsani, Dieter Fox, Ranjay Krishna
CVPR 2024 Modeling Collaborator: Enabling Subjective Vision Classification with Minimal Human Effort via LLM Tool-Use Imad Eddine Toubal, Aditya Avinash, Neil Gordon Alldrin, Jan Dlabal, Wenlei Zhou, Enming Luo, Otilia Stretcu, Hao Xiong, Chun-Ta Lu, Howard Zhou, Ranjay Krishna, Ariel Fuxman, Tom Duerig
NeurIPS 2024 Multilingual Diversity Improves Vision-Language Representations Thao Nguyen, Matthew Wallingford, Sebastin Santy, Wei-Chiu Ma, Sewoong Oh, Ludwig Schmidt, Pang Wei Koh, Ranjay Krishna
NeurIPS 2024 NaturalBench: Evaluating Vision-Language Models on Natural Adversarial Samples Baiqi Li, Zhiqiu Lin, Wenxuan Peng, Jean de Dieu Nyandwi, Daniel Jiang, Zixian Ma, Simran Khanuja, Ranjay Krishna, Graham Neubig, Deva Ramanan
ICML 2024 Offline Training of Language Model Agents with Functions as Learnable Weights Shaokun Zhang, Jieyu Zhang, Jiale Liu, Linxin Song, Chi Wang, Ranjay Krishna, Qingyun Wu
CVPR 2024 Quilt-LLaVA: Visual Instruction Tuning by Extracting Localized Narratives from Open-Source Histopathology Videos Mehmet Saygin Seyfioglu, Wisdom O. Ikezogwo, Fatemeh Ghezloo, Ranjay Krishna, Linda Shapiro
CoRL 2024 RoboPoint: A Vision-Language Model for Spatial Affordance Prediction in Robotics Wentao Yuan, Jiafei Duan, Valts Blukis, Wilbert Pumacay, Ranjay Krishna, Adithyavairavan Murali, Arsalan Mousavian, Dieter Fox
ECCV 2024 SPARO: Selective Attention for Robust and Compositional Transformer Encodings for Vision Ankit Vani, Bac Nguyen, Samuel Lavoie, Ranjay Krishna, Aaron Courville
CVPR 2024 SPOC: Imitating Shortest Paths in Simulation Enables Effective Navigation and Manipulation in the Real World Kiana Ehsani, Tanmay Gupta, Rose Hendrix, Jordi Salvador, Luca Weihs, Kuo-Hao Zeng, Kunal Pratap Singh, Yejin Kim, Winson Han, Alvaro Herrasti, Ranjay Krishna, Dustin Schwenk, Eli VanderBilt, Aniruddha Kembhavi
ICLR 2024 Selective Visual Representations Improve Convergence and Generalization for Embodied AI Ainaz Eftekhar, Kuo-Hao Zeng, Jiafei Duan, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna
NeurIPS 2024 Superposed Decoding: Multiple Generations from a Single Autoregressive Inference Pass Ethan Shen, Alan Fan, Sarah Pratt, Jae Sung Park, Matthew Wallingford, Sham Kakade, Ari Holtzman, Ranjay Krishna, Ali Farhadi, Aditya Kusupati
NeurIPS 2024 Task Me Anything Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna
NeurIPSW 2024 Taskverse: A Benchmark Generation Engine for Multi-Modal Language Model Jieyu Zhang, Weikai Huang, Zixian Ma, Oscar Michel, Dong He, Tanmay Gupta, Wei-Chiu Ma, Ali Farhadi, Aniruddha Kembhavi, Ranjay Krishna
ECCV 2024 The Hard Positive Truth About Vision-Language Compositionality Amita Kamath, Cheng-Yu Hsieh, Kai-Wei Chang, Ranjay Krishna
NeurIPS 2024 The Unmet Promise of Synthetic Training Images: Using Retrieved Real Images Performs Better Scott Geng, Cheng-Yu Hsieh, Vivek Ramanujan, Matthew Wallingford, Chun-Liang Li, Pang Wei Koh, Ranjay Krishna
ECCV 2024 Videoshop: Localized Semantic Video Editing with Noise-Extrapolated Diffusion Inversion Xiang Fan, Anand Bhattad, Ranjay Krishna
CVPR 2024 Visual Program Distillation: Distilling Tools and Programmatic Reasoning into Vision-Language Models Yushi Hu, Otilia Stretcu, Chun-Ta Lu, Krishnamurthy Viswanathan, Kenji Hata, Enming Luo, Ranjay Krishna, Ariel Fuxman
NeurIPS 2024 Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Ranjay Krishna
NeurIPSW 2024 Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models Yushi Hu, Weijia Shi, Xingyu Fu, Dan Roth, Mari Ostendorf, Luke Zettlemoyer, Noah A. Smith, Ranjay Krishna
CoRL 2023 AR2-D2: Training a Robot Without a Robot Jiafei Duan, Yi Ru Wang, Mohit Shridhar, Dieter Fox, Ranjay Krishna
ICCV 2023 Agile Modeling: From Concept to Classifier in Minutes Otilia Stretcu, Edward Vendrow, Kenji Hata, Krishnamurthy Viswanathan, Vittorio Ferrari, Sasan Tavakkol, Wenlei Zhou, Aditya Avinash, Enming Luo, Neil Gordon Alldrin, MohammadHossein Bateni, Gabriel Berger, Andrew Bunner, Chun-Ta Lu, Javier Rey, Giulia DeSalvo, Ranjay Krishna, Ariel Fuxman
NeurIPSW 2023 Agile Modeling: From Concept to Classifier in Minutes Otilia Stretcu, Edward Vendrow, Kenji Hata, Krishnamurthy Viswanathan, Vittorio Ferrari, Sasan Tavakkol, Wenlei Zhou, Aditya Avinash, Enming Luo, Neil Gordon Alldrin, Mohammadhossein Bateni, Gabriel Berger, Andrew Bunner, Chun-Ta Lu, Javier A Rey, Giulia DeSalvo, Ranjay Krishna, Ariel Fuxman
CVPR 2023 CREPE: Can Vision-Language Foundation Models Reason Compositionally? Zixian Ma, Jerry Hong, Mustafa Omer Gul, Mona Gandhi, Irena Gao, Ranjay Krishna
NeurIPS 2023 Cola: A Benchmark for Compositional Text-to-Image Retrieval Arijit Ray, Filip Radenovic, Abhimanyu Dubey, Bryan Plummer, Ranjay Krishna, Kate Saenko
NeurIPS 2023 DataComp: In Search of the Next Generation of Multimodal Datasets Samir Yitzhak Gadre, Gabriel Ilharco, Alex Fang, Jonathan Hayase, Georgios Smyrnis, Thao Nguyen, Ryan Marten, Mitchell Wortsman, Dhruba Ghosh, Jieyu Zhang, Eyal Orgad, Rahim Entezari, Giannis Daras, Sarah Pratt, Vivek Ramanujan, Yonatan Bitton, Kalyani Marathe, Stephen Mussmann, Richard Vencu, Mehdi Cherti, Ranjay Krishna, Pang Wei W Koh, Olga Saukh, Alexander J Ratner, Shuran Song, Hannaneh Hajishirzi, Ali Farhadi, Romain Beaumont, Sewoong Oh, Alex Dimakis, Jenia Jitsev, Yair Carmon, Vaishaal Shankar, Ludwig Schmidt
NeurIPS 2023 Large Language Model as Attributed Training Data Generator: A Tale of Diversity and Bias Yue Yu, Yuchen Zhuang, Jieyu Zhang, Yu Meng, Alexander J Ratner, Ranjay Krishna, Jiaming Shen, Chao Zhang
NeurIPS 2023 OBJECT 3DIT: Language-Guided 3D-Aware Image Editing Oscar Michel, Anand Bhattad, Eli VanderBilt, Ranjay Krishna, Aniruddha Kembhavi, Tanmay Gupta
NeurIPS 2023 Quilt-1M: One Million Image-Text Pairs for Histopathology Wisdom Ikezogwo, Saygin Seyfioglu, Fatemeh Ghezloo, Dylan Geva, Fatwir Sheikh Mohammed, Pavan Kumar Anand, Ranjay Krishna, Linda G. Shapiro
NeurIPS 2023 SugarCrepe: Fixing Hackable Benchmarks for Vision-Language Compositionality Cheng-Yu Hsieh, Jieyu Zhang, Zixian Ma, Aniruddha Kembhavi, Ranjay Krishna
ICCV 2023 TIFA: Accurate and Interpretable Text-to-Image Faithfulness Evaluation with Question Answering Yushi Hu, Benlin Liu, Jungo Kasai, Yizhong Wang, Mari Ostendorf, Ranjay Krishna, Noah A. Smith
NeurIPS 2022 ELIGN: Expectation Alignment as a Multi-Agent Intrinsic Reward Zixian Ma, Rose Wang, Li Fei-Fei, Michael Bernstein, Ranjay Krishna
CVPR 2022 Measuring Compositional Consistency for Video Question Answering Mona Gandhi, Mustafa Omer Gul, Eva Prakash, Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala
CVPR 2021 AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning Madeleine Grunde-McLaughlin, Ranjay Krishna, Maneesh Agrawala
ICLRW 2019 HYPE: Human-eYe Perceptual Evaluation of Generative Models Sharon Zhou, Mitchell Gordon, Ranjay Krishna, Austin Narcomey, Durim Morina, Michael S. Bernstein
NeurIPS 2019 HYPE: A Benchmark for Human eYe Perceptual Evaluation of Generative Models Sharon Zhou, Mitchell Gordon, Ranjay Krishna, Austin Narcomey, Li Fei-Fei, Michael Bernstein
ICCVW 2019 Scene Graph Prediction with Limited Labels Vincent S. Chen, Paroma Varma, Ranjay Krishna, Michael S. Bernstein, Christopher Ré, Li Fei-Fei
ICCVW 2019 Visual Relationships as Functions: Enabling Few-Shot Scene Graph Prediction Apoorva Dornadula, Austin Narcomey, Ranjay Krishna, Michael S. Bernstein, Li Fei-Fei
CVPR 2017 A Hierarchical Approach for Generating Descriptive Image Paragraphs Jonathan Krause, Justin Johnson, Ranjay Krishna, Li Fei-Fei
ICCV 2017 Dense-Captioning Events in Videos Ranjay Krishna, Kenji Hata, Frederic Ren, Li Fei-Fei, Juan Carlos Niebles
ECCV 2016 Visual Relationship Detection with Language Priors Cewu Lu, Ranjay Krishna, Michael S. Bernstein, Li Fei-Fei
CVPR 2015 Image Retrieval Using Scene Graphs Justin Johnson, Ranjay Krishna, Michael Stark, Li-Jia Li, David Shamma, Michael Bernstein, Li Fei-Fei