Robot Trains Robot: Automatic Real-World Policy Adaptation and Learning for Humanoids
Abstract
Simulation-based reinforcement learning (RL) has significantly advanced humanoid locomotion tasks, yet direct real-world RL from scratch or starting from pretrained policies remains rare, limiting the full potential of humanoid robots. Real-world training, despite being crucial for overcoming the sim-to-real gap, faces substantial challenges related to safety, reward design, and learning efficiency. To address these limitations, we propose Robot-Trains-Robot (RTR), a novel framework where a robotic arm teacher actively supports and guides a humanoid student robot. The RTR system provides protection, schedule, reward, perturbation, failure detection, and automatic resets, enabling efficient long-term real-world training with minimal human intervention. Furthermore, we propose a novel RL pipeline that facilitates and stabilizes sim-to-real transfer by optimizing a single dynamics-encoded latent variable in the real world. We validate our method through two challenging real-world humanoid tasks: fine-tuning a walking policy for precise speed tracking and learning a humanoid swing-up task from scratch, illustrating the promising capabilities of real-world humanoid learning realized by RTR-style systems.
Cite
Text
Hu et al. "Robot Trains Robot: Automatic Real-World Policy Adaptation and Learning for Humanoids." Proceedings of The 9th Conference on Robot Learning, 2025.
Markdown
[Hu et al. "Robot Trains Robot: Automatic Real-World Policy Adaptation and Learning for Humanoids." Proceedings of The 9th Conference on Robot Learning, 2025.](https://mlanthology.org/corl/2025/hu2025corl-robot/)
BibTeX
@inproceedings{hu2025corl-robot,
  title = {{Robot Trains Robot: Automatic Real-World Policy Adaptation and Learning for Humanoids}},
  author = {Hu, Kaizhe and Shi, Haochen and He, Yao and Wang, Weizhuo and Liu, Karen and Song, Shuran},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  year = {2025},
  pages = {1672--1689},
  volume = {305},
  url = {https://mlanthology.org/corl/2025/hu2025corl-robot/}
}