Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids

Abstract

Learning generalizable robot manipulation policies, especially for complex multi-fingered humanoids, remains a significant challenge. Existing approaches primarily rely on extensive data collection and imitation learning, which are expensive, labor-intensive, and difficult to scale. Sim-to-real reinforcement learning (RL) offers a promising alternative, but has mostly succeeded in simpler state-based or single-hand setups. How to effectively extend this to vision-based, contact-rich bimanual manipulation tasks remains an open question. In this paper, we introduce a practical sim-to-real RL recipe that trains a humanoid robot to perform three challenging dexterous manipulation tasks: grasp-and-reach, box lift and bimanual handover. Our method features an automated real-to-sim tuning module, a generalized reward formulation based on contact and object goals, a divide-and-conquer policy distillation framework, and a hybrid object representation strategy with modality-specific augmentation. We demonstrate high success rates on unseen objects and robust, adaptive policy behaviors – highlighting that vision-based dexterous manipulation via sim-to-real RL is not only viable, but also scalable and broadly applicable to real-world humanoid manipulation tasks.

Cite

Text

Lin et al. "Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids." Proceedings of The 9th Conference on Robot Learning, 2025.

Markdown

[Lin et al. "Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids." Proceedings of The 9th Conference on Robot Learning, 2025.](https://mlanthology.org/corl/2025/lin2025corl-simtoreal/)

BibTeX

@inproceedings{lin2025corl-simtoreal,
  title     = {{Sim-to-Real Reinforcement Learning for Vision-Based Dexterous Manipulation on Humanoids}},
  author    = {Lin, Toru and Sachdev, Kartik and Fan, Linxi and Malik, Jitendra and Zhu, Yuke},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  year      = {2025},
  pages     = {4926--4940},
  volume    = {305},
  url       = {https://mlanthology.org/corl/2025/lin2025corl-simtoreal/}
}