Towards Generalist Robot Learning from Internet Video: A Survey
Abstract
Scaling deep learning to massive and diverse internet data has driven remarkable breakthroughs in domains such as video generation and natural language processing. Robot learning, however, has thus far failed to replicate this success and remains constrained by a scarcity of available data. Learning from Videos (LfV) methods aim to address this data bottleneck by augmenting traditional robot data with large-scale internet video. This video data provides foundational information regarding physical dynamics, behaviours, and tasks, and can be highly informative for general-purpose robots. This survey systematically examines the emerging field of LfV. We first outline essential concepts, including detailing fundamental LfV challenges such as distribution shift and missing action labels in video data. Next, we comprehensively review current methods for extracting knowledge from large-scale internet video, overcoming LfV challenges, and improving robot learning through video-informed training. The survey concludes with a critical discussion of future opportunities. Here, we emphasize the need for scalable foundation model approaches that can leverage the full range of available internet video and enhance the learning of robot policies and dynamics models. Overall, the survey aims to inform and catalyse future LfV research, driving progress towards general-purpose robots.
Cite
Text
McCarthy et al. "Towards Generalist Robot Learning from Internet Video: A Survey." Journal of Artificial Intelligence Research, 2025. doi:10.1613/JAIR.1.17400
Markdown
[McCarthy et al. "Towards Generalist Robot Learning from Internet Video: A Survey." Journal of Artificial Intelligence Research, 2025.](https://mlanthology.org/jair/2025/mccarthy2025jair-generalist/) doi:10.1613/JAIR.1.17400
BibTeX
@article{mccarthy2025jair-generalist,
title = {{Towards Generalist Robot Learning from Internet Video: A Survey}},
author = {McCarthy, Robert and Tan, Daniel Chee Hian and Schmidt, Dominik and Acero, Fernando and Herr, Nathan and Du, Yilun and Thuruthel, Thomas George and Li, Zhibin},
journal = {Journal of Artificial Intelligence Research},
year = {2025},
doi = {10.1613/JAIR.1.17400},
volume = {83},
url = {https://mlanthology.org/jair/2025/mccarthy2025jair-generalist/}
}