Shao, Rui

21 publications

ICCV 2025. Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation. Jiaer Xia, Bingkui Tong, Yuhang Zang, Rui Shao, Kaiyang Zhou
NeurIPS 2025. CogVLA: Cognition-Aligned Vision-Language-Action Models via Instruction-Driven Routing & Sparsification. Wei Li, Renshan Zhang, Rui Shao, Jie He, Liqiang Nie
ICCV 2025. FALCON: Resolving Visual Redundancy and Fragmentation in High-Resolution Multimodal Large Language Models via Visual Registers. Renshan Zhang, Rui Shao, Gongwei Chen, Miao Zhang, Kaiwen Zhou, Weili Guan, Liqiang Nie
IJCAI 2025. Incorporating Legal Logic into Deep Learning: An Intelligent Approach to Probation Prediction. Qinghua Wang, Xu Zhang, Lingyan Yang, Rui Shao, Bonan Wang, Fang Wang, Cunquan Qu
CVPR 2025. LION-FS: Fast & Slow Video-Language Thinker as Online Video Assistant. Wei Li, Bing Hu, Rui Shao, Leyang Shen, Liqiang Nie
ICCV 2025. Less Is More: Empowering GUI Agent with Context-Aware Simplification. Gongwei Chen, Xurui Zhou, Rui Shao, Yibo Lyu, Kaiwen Zhou, Shuai Wang, Wentao Li, Yinchuan Li, Zhongang Qi, Liqiang Nie
CVPR 2025. Optimus-2: Multimodal Minecraft Agent with Goal-Observation-Action Conditioned Policy. Zaijing Li, Yuquan Xie, Rui Shao, Gongwei Chen, Dongmei Jiang, Liqiang Nie
NeurIPS 2025. PUO-Bench: A Panel Understanding and Operation Benchmark with a Privacy-Preserving Framework. Wei Lin, Yiwei Zhou, Junkai Zhang, Rui Shao, Zhiyuan Zhao, Junyu Gao, Antoni B. Chan, Xuelong Li
ICML 2025. STAR: Learning Diverse Robot Skill Abstractions Through Rotation-Augmented Vector Quantization. Hao Li, Qi Lv, Rui Shao, Xiang Deng, Yinchuan Li, Jianye Hao, Liqiang Nie
ICLR 2025. Spa-Bench: A Comprehensive Benchmark for Smartphone Agent Evaluation. Jingxuan Chen, Derek Yuen, Bin Xie, Yuhao Yang, Gongwei Chen, Zhihao Wu, Li Yixing, Xurui Zhou, Weiwen Liu, Shuai Wang, Kaiwen Zhou, Rui Shao, Liqiang Nie, Yasheng Wang, Jianye Hao, Jun Wang, Kun Shao
CVPR 2025. Spatial-Temporal Graph Diffusion Policy with Kinematic Modeling for Bimanual Robotic Manipulation. Qi Lv, Hao Li, Xiang Deng, Rui Shao, Yinchuan Li, Jianye Hao, Longxiang Gao, Michael Yu Wang, Liqiang Nie
ECCV 2024. CAT: Enhancing Multimodal Large Language Model to Answer Questions in Dynamic Audio-Visual Scenarios. Qilang Ye, Zitong Yu, Rui Shao, Xinyu Xie, Philip Torr, Xiaochun Cao
CVPR 2024. LION: Empowering Multimodal Large Language Model with Dual-Level Visual Knowledge. Gongwei Chen, Leyang Shen, Rui Shao, Xiang Deng, Liqiang Nie
NeurIPS 2024. MoME: Mixture of Multimodal Experts for Generalist Multimodal Large Language Models. Leyang Shen, Gongwei Chen, Rui Shao, Weili Guan, Liqiang Nie
NeurIPS 2024. Optimus-1: Hybrid Multimodal Memory Empowered Agents Excel in Long-Horizon Tasks. Zaijing Li, Yuquan Xie, Rui Shao, Gongwei Chen, Dongmei Jiang, Liqiang Nie
ICML 2024. RoboMP$^2$: A Robotic Multimodal Perception-Planning Framework with Multimodal Large Language Models. Qi Lv, Hao Li, Xiang Deng, Rui Shao, Michael Y Wang, Liqiang Nie
NeurIPSW 2024. Spa-Bench: A Comprehensive Benchmark for Smartphone Agent Evaluation. Jingxuan Chen, Derek Yuen, Bin Xie, Yuhao Yang, Gongwei Chen, Zhihao Wu, Li Yixing, Xurui Zhou, Weiwen Liu, Shuai Wang, Rui Shao, Liqiang Nie, Yasheng Wang, Jianye Hao, Jun Wang, Kun Shao
CVPR 2023. Detecting and Grounding Multi-Modal Media Manipulation. Rui Shao, Tianxing Wu, Ziwei Liu
ECCV 2022. Detecting and Recovering Sequential DeepFake Manipulation. Rui Shao, Tianxing Wu, Ziwei Liu
ECCV 2020. Open-Set Adversarial Defense. Rui Shao, Pramuditha Perera, Pong C. Yuen, Vishal M. Patel
AAAI 2020. Regularized Fine-Grained Meta Face Anti-Spoofing. Rui Shao, Xiangyuan Lan, Pong C. Yuen