Lu, Yao
74 publications
CVPR
2025
CoT-VLA: Visual Chain-of-Thought Reasoning for Vision-Language-Action Models
Qingqing Zhao, Yao Lu, Moo Jin Kim, Zipeng Fu, Zhuoyang Zhang, Yecheng Wu, Zhaoshuo Li, Qianli Ma, Song Han, Chelsea Finn, Ankur Handa, Tsung-Yi Lin, Gordon Wetzstein, Ming-Yu Liu, Donglai Xiang
ICLR
2025
LongVILA: Scaling Long-Context Visual Language Models for Long Videos
Yukang Chen, Fuzhao Xue, Dacheng Li, Qinghao Hu, Ligeng Zhu, Xiuyu Li, Yunhao Fang, Haotian Tang, Shang Yang, Zhijian Liu, Yihui He, Hongxu Yin, Pavlo Molchanov, Jan Kautz, Linxi Fan, Yuke Zhu, Yao Lu, Song Han
CVPRW
2025
NTIRE 2025 Challenge on Light Field Image Super-Resolution: Methods and Results
Yingqian Wang, Zhengyu Liang, Fengyuan Zhang, Lvli Tian, Longguang Wang, Juncheng Li, Jungang Yang, Radu Timofte, Yulan Guo, Kai Jin, Zeqiang Wei, Angulia Yang, Di Wu, Mingzhi Gao, Xiuzhuang Zhou, Yue Yan, Yuaho Wang, Shuang Chen, Zeping Tian, Yizhi Hu, Yao Lu, Haosong Liu, Xiancheng Zhu, Huanqiang Zeng, Jianqing Zhu, Yifan Shi, Junhui Hou, Mingyang Yu, Zhijian Wu, Dingjiang Huang, Wenli Zheng, Zekai Xu, Huiyuan Fu, Heng Zhang, Zhijuan Huang, Hongyuan Yu, Zeke Zexi Hu, Haodong Chen, Vera Yuk Ying Chung, Xiaoming Chen, Zean Chen, Yeyao Chen, Gangyi Jiang, Haiyong Xu, Ting Luo, Guanglong Liao, Danhao Zhang, Siyu Zhang, Wendong Mao, Zhongfeng Wang, Sunita Arya, Abhishek Kumar Sinha, S. Manthira Moorthi, Hao Zhang, Hao Sheng, Da Yang, Zhenglong Cui, Shuai Wang, Haotian Zhang, Xingzheng Wang, Yuanbo Huang, Jiahao Lin, Yuhang Lin, Ahmed Salem, Ebrahem Elkady, Hatem Ibrahem, Jae-Won Suh, Hyun-Soo Kang, Changguang Wu, Hao Hou, Pengpeng Li, Peng Huang, Jiangxin Dong, Jinhui Tang
CVPR
2025
NVILA: Efficient Frontier Visual Language Models
Zhijian Liu, Ligeng Zhu, Baifeng Shi, Zhuoyang Zhang, Yuming Lou, Shang Yang, Haocheng Xi, Shiyi Cao, Yuxian Gu, Dacheng Li, Xiuyu Li, Haotian Tang, Yunhao Fang, Yukang Chen, Cheng-Yu Hsieh, De-An Huang, An-Chieh Cheng, Jinyi Hu, Sifei Liu, Ranjay Krishna, Pavlo Molchanov, Jan Kautz, Hongxu Yin, Song Han, Yao Lu
ICLR
2025
SANA: Efficient High-Resolution Text-to-Image Synthesis with Linear Diffusion Transformers
Enze Xie, Junsong Chen, Junyu Chen, Han Cai, Haotian Tang, Yujun Lin, Zhekai Zhang, Muyang Li, Ligeng Zhu, Yao Lu, Song Han
NeurIPS
2025
Scaling RL to Long Videos
Yukang Chen, Wei Huang, Baifeng Shi, Qinghao Hu, Hanrong Ye, Ligeng Zhu, Zhijian Liu, Pavlo Molchanov, Jan Kautz, Xiaojuan Qi, Sifei Liu, Hongxu Yin, Yao Lu, Song Han
CVPR
2025
Scaling Vision Pre-Training to 4k Resolution
Baifeng Shi, Boyi Li, Han Cai, Yao Lu, Sifei Liu, Marco Pavone, Jan Kautz, Song Han, Trevor Darrell, Pavlo Molchanov, Hongxu Yin
CVPR
2025
VILA-M3: Enhancing Vision-Language Models with Medical Expert Knowledge
Vishwesh Nath, Wenqi Li, Dong Yang, Andriy Myronenko, Mingxin Zheng, Yao Lu, Zhijian Liu, Hongxu Yin, Yee Man Law, Yucheng Tang, Pengfei Guo, Can Zhao, Ziyue Xu, Yufan He, Stephanie Harmon, Benjamin Simon, Greg Heinrich, Stephen Aylward, Marc Edgar, Michael Zephyr, Pavlo Molchanov, Baris Turkbey, Holger Roth, Daguang Xu
ICLR
2025
VILA-U: A Unified Foundation Model Integrating Visual Understanding and Generation
Yecheng Wu, Zhuoyang Zhang, Junyu Chen, Haotian Tang, Dacheng Li, Yunhao Fang, Ligeng Zhu, Enze Xie, Hongxu Yin, Li Yi, Song Han, Yao Lu
TMLR
2025
Wolf: Dense Video Captioning with a World Summarization Framework
Boyi Li, Ligeng Zhu, Ran Tian, Shuhan Tan, Yuxiao Chen, Yao Lu, Yin Cui, Sushant Veer, Max Ehrlich, Jonah Philion, Xinshuo Weng, Fuzhao Xue, Linxi Fan, Yuke Zhu, Jan Kautz, Andrew Tao, Ming-Yu Liu, Sanja Fidler, Boris Ivanovic, Trevor Darrell, Jitendra Malik, Song Han, Marco Pavone
NeurIPS
2025
WorldModelBench: Judging Video Generation Models as World Models
Dacheng Li, Yunhao Fang, Yukang Chen, Shuo Yang, Shiyi Cao, Justin Wong, Michael Luo, Xiaolong Wang, Hongxu Yin, Joseph E. Gonzalez, Ion Stoica, Song Han, Yao Lu
CVPRW
2024
NTIRE 2024 Challenge on Light Field Image Super-Resolution: Methods and Results
Yingqian Wang, Zhengyu Liang, Qianyu Chen, Longguang Wang, Jungang Yang, Radu Timofte, Yulan Guo, Wentao Chao, Yiming Kan, Xuechun Wang, Fuqing Duan, Guanghui Wang, Wang Xia, Ziqi Wang, Yue Yan, Peiqi Xia, Shunzhou Wang, Yao Lu, Angulia Yang, Kai Jin, Zeqiang Wei, Sha Guo, Mingzhi Gao, Xiuzhuang Zhou, Zhongxin Yu, Shaofei Luo, Cheng Zhong, Shaorui Chen, Long Peng, Yuhong He, Gaosheng Liu, Huanjing Yue, Jingyu Yang, Zhengjian Yao, Jiakui Hu, Lujia Jin, Zhi-Song Liu, Chenhang He, Jun Xiao, Xiuyuan Wang, Zonglin Tian, Yifan Mao, Deyang Liu, Shizheng Li, Ping An
ICLR
2024
RT-Trajectory: Robotic Task Generalization via Hindsight Trajectory Sketches
Jiayuan Gu, Sean Kirmani, Paul Wohlhart, Yao Lu, Montserrat Gonzalez Arenas, Kanishka Rao, Wenhao Yu, Chuyuan Fu, Keerthana Gopalakrishnan, Zhuo Xu, Priya Sundaresan, Peng Xu, Hao Su, Karol Hausman, Chelsea Finn, Quan Vuong, Ted Xiao
NeurIPSW
2024
Wolf: Captioning Everything with a World Summarization Framework
Boyi Li, Ligeng Zhu, Ran Tian, Shuhan Tan, Yuxiao Chen, Yao Lu, Yin Cui, Sushant Veer, Max Ehrlich, Jonah Philion, Xinshuo Weng, Fuzhao Xue, Andrew Tao, Ming-Yu Liu, Sanja Fidler, Boris Ivanovic, Trevor Darrell, Jitendra Malik, Song Han, Marco Pavone
NeurIPS
2023
Grounded Decoding: Guiding Text Generation with Grounded Models for Embodied Agents
Wenlong Huang, Fei Xia, Dhruv Shah, Danny Driess, Andy Zeng, Yao Lu, Pete Florence, Igor Mordatch, Sergey Levine, Karol Hausman, Brian Ichter
ICML
2023
Jump-Start Reinforcement Learning
Ikechukwu Uchendu, Ted Xiao, Yao Lu, Banghua Zhu, Mengyuan Yan, Joséphine Simon, Matthew Bennice, Chuyuan Fu, Cong Ma, Jiantao Jiao, Sergey Levine, Karol Hausman
CVPRW
2023
NTIRE 2023 Challenge on Light Field Image Super-Resolution: Dataset, Methods and Results
Yingqian Wang, Longguang Wang, Zhengyu Liang, Jungang Yang, Radu Timofte, Yulan Guo, Kai Jin, Zeqiang Wei, Angulia Yang, Sha Guo, Mingzhi Gao, Xiuzhuang Zhou, Vinh Van Duong, Thuc Nguyen Huu, Jonghoon Yim, Byeungwoo Jeon, Yutong Liu, Zhen Cheng, Zeyu Xiao, Ruikang Xu, Zhiwei Xiong, Gaosheng Liu, Manchang Jin, Huanjing Yue, Jingyu Yang, Chen Gao, Shuo Zhang, Song Chang, Youfang Lin, Wentao Chao, Xuechun Wang, Guanghui Wang, Fuqing Duan, Wang Xia, Yan Wang, Peiqi Xia, Shunzhou Wang, Yao Lu, Ruixuan Cong, Hao Sheng, Da Yang, Rongshan Chen, Sizhe Wang, Zhenglong Cui, Yilei Chen, Yongjie Lu, Dongjun Cai, Ping An, Ahmed Salem, Hatem Ibrahem, Bilel Yagoub, Hyun Soo Kang, Zekai Zeng, Heng Wu
CoRL
2023
Open-World Object Manipulation Using Pre-Trained Vision-Language Models
Austin Stone, Ted Xiao, Yao Lu, Keerthana Gopalakrishnan, Kuang-Huei Lee, Quan Vuong, Paul Wohlhart, Sean Kirmani, Brianna Zitkovich, Fei Xia, Chelsea Finn, Karol Hausman
CoRL
2023
Q-Transformer: Scalable Offline Reinforcement Learning via Autoregressive Q-Functions
Yevgen Chebotar, Quan Vuong, Karol Hausman, Fei Xia, Yao Lu, Alex Irpan, Aviral Kumar, Tianhe Yu, Alexander Herzog, Karl Pertsch, Keerthana Gopalakrishnan, Julian Ibarz, Ofir Nachum, Sumedh Anand Sontakke, Grecia Salazar, Huong T. Tran, Jodilyn Peralta, Clayton Tan, Deeksha Manjunath, Jaspiar Singh, Brianna Zitkovich, Tomas Jackson, Kanishka Rao, Chelsea Finn, Sergey Levine
CoRL
2023
RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control
Brianna Zitkovich, Tianhe Yu, Sichun Xu, Peng Xu, Ted Xiao, Fei Xia, Jialin Wu, Paul Wohlhart, Stefan Welker, Ayzaan Wahid, Quan Vuong, Vincent Vanhoucke, Huong Tran, Radu Soricut, Anikait Singh, Jaspiar Singh, Pierre Sermanet, Pannag R. Sanketi, Grecia Salazar, Michael S. Ryoo, Krista Reymann, Kanishka Rao, Karl Pertsch, Igor Mordatch, Henryk Michalewski, Yao Lu, Sergey Levine, Lisa Lee, Tsang-Wei Edward Lee, Isabel Leal, Yuheng Kuang, Dmitry Kalashnikov, Ryan Julian, Nikhil J. Joshi, Alex Irpan, Brian Ichter, Jasmine Hsu, Alexander Herzog, Karol Hausman, Keerthana Gopalakrishnan, Chuyuan Fu, Pete Florence, Chelsea Finn, Kumar Avinava Dubey, Danny Driess, Tianli Ding, Krzysztof Marcin Choromanski, Xi Chen, Yevgen Chebotar, Justice Carbajal, Noah Brown, Anthony Brohan, Montserrat Gonzalez Arenas, Kehang Han
NeurIPSW
2023
RoboVQA: Multimodal Long-Horizon Reasoning for Robotics
Pierre Sermanet, Tianli Ding, Jeffrey Zhao, Fei Xia, Debidatta Dwibedi, Keerthana Gopalakrishnan, Christine Chan, Gabriel Dulac-Arnold, Sharath Maddineni, Nikhil Joshi, Pete Florence, Wei Han, Robert Baruch, Yao Lu, Suvir Mirchandani, Peng Xu, Pannag Sanketi, Karol Hausman, Izhak Shafran, Brian Ichter, Yuan Cao
NeurIPS
2023
Waymax: An Accelerated, Data-Driven Simulator for Large-Scale Autonomous Driving Research
Cole Gulino, Justin Fu, Wenjie Luo, George Tucker, Eli Bronstein, Yiren Lu, Jean Harb, Xinlei Pan, Yan Wang, Xiangyu Chen, John Co-Reyes, Rishabh Agarwal, Rebecca Roelofs, Yao Lu, Nico Montali, Paul Mougin, Zoey Yang, Brandyn White, Aleksandra Faust, Rowan McAllister, Dragomir Anguelov, Benjamin Sapp
CoRL
2022
Do as I Can, Not as I Say: Grounding Language in Robotic Affordances
Brian Ichter, Anthony Brohan, Yevgen Chebotar, Chelsea Finn, Karol Hausman, Alexander Herzog, Daniel Ho, Julian Ibarz, Alex Irpan, Eric Jang, Ryan Julian, Dmitry Kalashnikov, Sergey Levine, Yao Lu, Carolina Parada, Kanishka Rao, Pierre Sermanet, Alexander T Toshev, Vincent Vanhoucke, Fei Xia, Ted Xiao, Peng Xu, Mengyuan Yan, Noah Brown, Michael Ahn, Omar Cortes, Nicolas Sievers, Clayton Tan, Sichun Xu, Diego Reyes, Jarek Rettinghouse, Jornell Quiambao, Peter Pastor, Linda Luu, Kuang-Huei Lee, Yuheng Kuang, Sally Jesmonth, Nikhil J. Joshi, Kyle Jeffrey, Rosario Jauregui Ruano, Jasmine Hsu, Keerthana Gopalakrishnan, Byron David, Andy Zeng, Chuyuan Kelly Fu
CoRL
2021
AW-Opt: Learning Robotic Skills with Imitation and Reinforcement at Scale
Yao Lu, Karol Hausman, Yevgen Chebotar, Mengyuan Yan, Eric Jang, Alexander Herzog, Ted Xiao, Alex Irpan, Mohi Khansari, Dmitry Kalashnikov, Sergey Levine
ICML
2021
Actionable Models: Unsupervised Offline Reinforcement Learning of Robotic Skills
Yevgen Chebotar, Karol Hausman, Yao Lu, Ted Xiao, Dmitry Kalashnikov, Jacob Varley, Alex Irpan, Benjamin Eysenbach, Ryan C Julian, Chelsea Finn, Sergey Levine