MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data

Abstract

This paper introduces MobileH2R, a framework for learning generalizable vision-based human-to-mobile-robot (H2MR) handover skills. Unlike traditional fixed-base handovers, this task requires a mobile robot to reliably receive objects in a large workspace enabled by its mobility. Our key insight is that generalizable handover skills can be developed in simulators using high-quality synthetic data, without the need for real-world demonstrations. To achieve this, we propose a scalable pipeline for generating diverse synthetic full-body human motion data, an automated method for creating safe and imitation-friendly demonstrations, and an efficient 4D imitation learning method for distilling large-scale demonstrations into closed-loop policies with base-arm coordination. Experimental evaluations in both simulators and the real world show significant improvements (at least +15% success rate) over baseline methods in all cases. Experiments also validate that large-scale and diverse synthetic data greatly enhances robot learning, highlighting our scalable framework.

Cite

Text

Wang et al. "MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data." Conference on Computer Vision and Pattern Recognition, 2025. doi:10.1109/CVPR52734.2025.01614

Markdown

[Wang et al. "MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data." Conference on Computer Vision and Pattern Recognition, 2025.](https://mlanthology.org/cvpr/2025/wang2025cvpr-mobileh2r/) doi:10.1109/CVPR52734.2025.01614

BibTeX

@inproceedings{wang2025cvpr-mobileh2r,
  title     = {{MobileH2R: Learning Generalizable Human to Mobile Robot Handover Exclusively from Scalable and Diverse Synthetic Data}},
  author    = {Wang, Zifan and Chen, Ziqing and Chen, Junyu and Wang, Jilong and Yang, Yuxin and Liu, Yunze and Liu, Xueyi and Wang, He and Yi, Li},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2025},
  pages     = {17315--17325},
  doi       = {10.1109/CVPR52734.2025.01614},
  url       = {https://mlanthology.org/cvpr/2025/wang2025cvpr-mobileh2r/}
}