Ryoo, Michael S.

65 publications

TMLR 2026 SMILE: A Composite Lexical-Semantic Metric for Question-Answering Evaluation Shrikant Kendre, Austin Xu, Honglu Zhou, Michael S Ryoo, Shafiq Joty, Juan Carlos Niebles
ICCV 2025 Adaptive Caching for Faster Video Generation with Diffusion Transformers Kumara Kahatapitiya, Haozhe Liu, Sen He, Ding Liu, Menglin Jia, Chenyang Zhang, Michael S. Ryoo, Tian Xie
ICLR 2025 LLaRA: Supercharging Robot Learning Data for Vision-Language Policy Xiang Li, Cristina Mata, Jongwoo Park, Kumara Kahatapitiya, Yoo Sung Jang, Jinghuan Shang, Kanchana Ranasinghe, Ryan D Burgert, Mu Cai, Yong Jae Lee, Michael S Ryoo
ICLR 2025 Understanding Long Videos with Multimodal Language Models Kanchana Ranasinghe, Xiang Li, Kumara Kahatapitiya, Michael S Ryoo
ICLRW 2025 VLM Q-Learning: Aligning Vision-Language Models for Interactive Decision-Making Jake Grigsby, Yuke Zhu, Michael S Ryoo, Juan Carlos Niebles
ECCV 2024 CoPT: Unsupervised Domain Adaptive Segmentation Using Domain-Agnostic Text Embeddings Cristina Mata, Kanchana N Ranasinghe, Michael S Ryoo
WACV 2024 Grafting Vision Transformers Jongwoo Park, Kumara Kahatapitiya, Donghyun Kim, Shivchander Sudalairaj, Quanfu Fan, Michael S. Ryoo
ECCVW 2024 Image Translation with Kernel Prediction Networks for Semantic Segmentation Cristina Mata, Michael S. Ryoo, Henrik Turbell
NeurIPSW 2024 Language Repository for Long Video Understanding Kumara Kahatapitiya, Kanchana Ranasinghe, Jongwoo Park, Michael S Ryoo
CVPR 2024 Learning to Localize Objects Improves Spatial Reasoning in Visual-LLMs Kanchana Ranasinghe, Satya Narayan Shukla, Omid Poursaeed, Michael S. Ryoo, Tsung-Yu Lin
WACV 2024 Limited Data, Unlimited Potential: A Study on ViTs Augmented by Masked Autoencoders Srijan Das, Tanmay Jain, Dominick Reilly, Pranav Balaji, Soumyajit Karmakar, Shyam Marjit, Xiang Li, Abhijit Das, Michael S. Ryoo
CVPR 2024 MAGICK: A Large-Scale Captioned Dataset from Matting Generated Images Using Chroma Keying Ryan D. Burgert, Brian L. Price, Jason Kuen, Yijun Li, Michael S. Ryoo
CVPR 2024 Mirasol3B: A Multimodal Autoregressive Model for Time-Aligned and Contextual Modalities Aj Piergiovanni, Isaac Noble, Dahun Kim, Michael S. Ryoo, Victor Gomes, Anelia Angelova
NeurIPSW 2024 Too Many Frames, Not All Useful: Efficient Strategies for Long-Form Video QA Jongwoo Park, Kanchana Ranasinghe, Kumara Kahatapitiya, Wonjeong Ryu, Donghyun Kim, Michael S Ryoo
CVPR 2024 VicTR: Video-Conditioned Text Representations for Activity Recognition Kumara Kahatapitiya, Anurag Arnab, Arsha Nagrani, Michael S. Ryoo
ECCVW 2024 xGen-VideoSyn-1: High-Fidelity Text-to-Video Synthesis with Compressed Representations Can Qin, Congying Xia, Krithika Ramakrishnan, Michael S. Ryoo, Lifu Tu, Yihao Feng, Manli Shu, Honglu Zhou, Anas Awadalla, Jun Wang, Senthil Purushwalkam, Le Xue, Yingbo Zhou, Huan Wang, Silvio Savarese, Juan Carlos Niebles, Zeyuan Chen, Ran Xu, Caiming Xiong
NeurIPS 2023 Active Vision Reinforcement Learning Under Limited Visual Observability Jinghuan Shang, Michael S Ryoo
NeurIPS 2023 Language-Based Action Concept Spaces Improve Video Self-Supervised Learning Kanchana Ranasinghe, Michael S Ryoo
CoRL 2023 RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control Brianna Zitkovich, Tianhe Yu, Sichun Xu, Peng Xu, Ted Xiao, Fei Xia, Jialin Wu, Paul Wohlhart, Stefan Welker, Ayzaan Wahid, Quan Vuong, Vincent Vanhoucke, Huong Tran, Radu Soricut, Anikait Singh, Jaspiar Singh, Pierre Sermanet, Pannag R. Sanketi, Grecia Salazar, Michael S. Ryoo, Krista Reymann, Kanishka Rao, Karl Pertsch, Igor Mordatch, Henryk Michalewski, Yao Lu, Sergey Levine, Lisa Lee, Tsang-Wei Edward Lee, Isabel Leal, Yuheng Kuang, Dmitry Kalashnikov, Ryan Julian, Nikhil J. Joshi, Alex Irpan, Brian Ichter, Jasmine Hsu, Alexander Herzog, Karol Hausman, Keerthana Gopalakrishnan, Chuyuan Fu, Pete Florence, Chelsea Finn, Kumar Avinava Dubey, Danny Driess, Tianli Ding, Krzysztof Marcin Choromanski, Xi Chen, Yevgen Chebotar, Justice Carbajal, Noah Brown, Anthony Brohan, Montserrat Gonzalez Arenas, Kehang Han
IJCAI 2023 SWAT: Spatial Structure Within and Among Tokens Kumara Kahatapitiya, Michael S. Ryoo
ICLR 2023 Socratic Models: Composing Zero-Shot Multimodal Reasoning with Language Andy Zeng, Maria Attarian, Brian Ichter, Krzysztof Marcin Choromanski, Adrian Wong, Stefan Welker, Federico Tombari, Aveek Purohit, Michael S Ryoo, Vikas Sindhwani, Johnny Lee, Vincent Vanhoucke, Pete Florence
CVPR 2023 Token Turing Machines Michael S. Ryoo, Keerthana Gopalakrishnan, Kumara Kahatapitiya, Ted Xiao, Kanishka Rao, Austin Stone, Yao Lu, Julian Ibarz, Anurag Arnab
WACV 2023 ViewCLR: Learning Self-Supervised Video Representation for Unseen Viewpoints Srijan Das, Michael S. Ryoo
AAAI 2023 Weakly-Guided Self-Supervised Pretraining for Temporal Activity Detection Kumara Kahatapitiya, Zhou Ren, Haoxiang Li, Zhenyu Wu, Michael S. Ryoo, Gang Hua
ICLR 2022 Hybrid Random Features Krzysztof Marcin Choromanski, Han Lin, Haoxian Chen, Arijit Sehanobish, Yuanzhe Ma, Deepali Jain, Jake Varley, Andy Zeng, Michael S Ryoo, Valerii Likhosherstov, Dmitry Kalashnikov, Vikas Sindhwani, Adrian Weller
CVPR 2022 MS-TCT: Multi-Scale Temporal ConvTransformer for Action Detection Rui Dai, Srijan Das, Kumara Kahatapitiya, Michael S. Ryoo, François Brémond
CVPR 2022 Self-Supervised Video Transformer Kanchana Ranasinghe, Muzammal Naseer, Salman Khan, Fahad Shahbaz Khan, Michael S. Ryoo
ECCV 2022 StARformer: Transformer with State-Action-Reward Representations for Visual Reinforcement Learning Jinghuan Shang, Kumara Kahatapitiya, Xiang Li, Michael S. Ryoo
CoRL 2022 TRITON: Neural Neural Textures for Better Sim2Real Ryan D. Burgert, Jinghuan Shang, Xiang Li, Michael S. Ryoo
ECCV 2022 Video Question Answering with Iterative Video-Text Co-Tokenization Aj Piergiovanni, Kairo Morton, Weicheng Kuo, Michael S. Ryoo, Anelia Angelova
ICCV 2021 4D-Net for Learned Multi-Modal Alignment Aj Piergiovanni, Vincent Casser, Michael S. Ryoo, Anelia Angelova
CVPRW 2021 Adaptive Intermediate Representations for Video Understanding Juhana Kangaspunta, A. J. Piergiovanni, Rico Jonschkowski, Michael S. Ryoo, Anelia Angelova
CVPR 2021 Coarse-Fine Networks for Temporal Activity Detection in Videos Kumara Kahatapitiya, Michael S. Ryoo
CVPR 2021 Recognizing Actions in Videos from Unseen Viewpoints Aj Piergiovanni, Michael S. Ryoo
ECCV 2020 Adversarial Generative Grammars for Human Activity Prediction Aj Piergiovanni, Anelia Angelova, Alexander Toshev, Michael S. Ryoo
ECCV 2020 AssembleNet++: Assembling Modality Representations via Attention Connections Michael S. Ryoo, Aj Piergiovanni, Juhana Kangaspunta, Anelia Angelova
ICLR 2020 AssembleNet: Searching for Multi-Stream Neural Connectivity in Video Architectures Michael S. Ryoo, Aj Piergiovanni, Mingxing Tan, Anelia Angelova
ECCV 2020 AttentionNAS: Spatiotemporal Attention Cell Search for Video Classification Xiaofang Wang, Xuehan Xiong, Maxim Neumann, Aj Piergiovanni, Michael S. Ryoo, Anelia Angelova, Kris M. Kitani, Wei Hua
AAAI 2020 Differentiable Grammars for Videos A. J. Piergiovanni, Anelia Angelova, Michael S. Ryoo
ECCV 2020 Password-Conditioned Anonymization and Deanonymization with Face Identity Transformers Xiuye Gu, Weixin Luo, Michael S. Ryoo, Yong Jae Lee
CVPRW 2019 Early Detection of Injuries in MLB Pitchers from Video A. J. Piergiovanni, Michael S. Ryoo
CoRL 2019 Model-Based Behavioral Cloning with Future Image Similarity Learning Alan Wu, Aj Piergiovanni, Michael S. Ryoo
CVPRW 2018 Action-Conditioned Convolutional Future Regression Models for Robot Imitation Learning Alan Wu, A. J. Piergiovanni, Michael S. Ryoo
AAAI 2018 Extreme Low Resolution Activity Recognition with Multi-Siamese Embedding Learning Michael S. Ryoo, Kiyoon Kim, Hyun Jong Yang
CVPRW 2018 Fine-Grained Activity Recognition in Baseball Videos A. J. Piergiovanni, Michael S. Ryoo
ECCVW 2018 Forecasting Hands and Objects in Future Frames Chenyou Fan, Jangwon Lee, Michael S. Ryoo
ECCV 2018 Joint Person Segmentation and Identification in Synchronized First- and Third-Person Videos Mingze Xu, Chenyou Fan, Yuchen Wang, Michael S. Ryoo, David J. Crandall
ECCV 2018 Learning to Anonymize Faces for Privacy Preserving Action Detection Zhongzheng Ren, Yong Jae Lee, Michael S. Ryoo
CVPR 2017 Identifying First-Person Camera Wearers in Third-Person Videos Chenyou Fan, Jangwon Lee, Mingze Xu, Krishna Kumar Singh, Yong Jae Lee, David J. Crandall, Michael S. Ryoo
CVPRW 2017 Learning Robot Activities from First-Person Human Videos Using Convolutional Future Regression Jangwon Lee, Michael S. Ryoo
IJCAI 2017 Multi-Type Activity Recognition from a Robot's Viewpoint Ilaria Gori, J. K. Aggarwal, Larry H. Matthies, Michael S. Ryoo
AAAI 2017 Privacy-Preserving Human Activity Recognition from Extreme Low Resolution Michael S. Ryoo, Brandon Rothrock, Charles Fleming, Hyun Jong Yang
AAAI 2017 Learning Latent Subevents in Activity Videos Using Temporal Attention Filters A. J. Piergiovanni, Chenyou Fan, Michael S. Ryoo
IJCAI 2016 Learning Social Affordance for Human-Robot Interaction Tianmin Shu, Michael S. Ryoo, Song-Chun Zhu
CVPR 2015 Pooled Motion Features for First-Person Videos Michael S. Ryoo, Brandon Rothrock, Larry Matthies
WACV 2015 Robot-Centric Activity Recognition from First-Person RGB-D Videos Lu Xia, Ilaria Gori, Jake K. Aggarwal, Michael S. Ryoo
CVPRW 2014 An Introduction to the 3rd Workshop on Egocentric (First-Person) Vision Steve Mann, Kris M. Kitani, Yong Jae Lee, Michael S. Ryoo, Alireza Fathi
CVPR 2013 First-Person Activity Recognition: What Are They Doing to Me? Michael S. Ryoo, Larry Matthies
ICCV 2011 Human Activity Prediction: Early Recognition of Ongoing Activities from Streaming Videos Michael S. Ryoo
ICCV 2009 Spatio-Temporal Relationship Match: Video Structure Comparison for Recognition of Complex Human Activities Michael S. Ryoo, Jake K. Aggarwal
CVPRW 2009 Stochastic Representation and Recognition of High-Level Group Activities: Describing Structural Uncertainties in Human Activities Michael S. Ryoo, Jake K. Aggarwal
CVPR 2008 Observe-and-Explain: A New Approach for Multiple Hypotheses Tracking of Humans and Objects Michael S. Ryoo, Jake K. Aggarwal
CVPR 2007 Hierarchical Recognition of Human Activities Interacting with Objects Michael S. Ryoo, J. K. Aggarwal
IJCAI 2007 Robust Human-Computer Interaction System Guiding a User by Providing Feedback Michael S. Ryoo, Jake K. Aggarwal
CVPR 2006 Recognition of Composite Human Activities Through Context-Free Grammar Based Representation Michael S. Ryoo, J. K. Aggarwal