Sim-to-Real Transfer for Vision-and-Language Navigation

Abstract

We study the challenging problem of releasing a robot in a previously unseen environment and having it follow unconstrained natural language navigation instructions. Recent work on the task of Vision-and-Language Navigation (VLN) has achieved significant progress in simulation. To assess the implications of this work for robotics, we transfer a VLN agent trained in simulation to a physical robot. To bridge the gap between the high-level discrete action space learned by the VLN agent and the robot’s low-level continuous action space, we propose a subgoal model to identify nearby waypoints, and we use domain randomization to mitigate visual domain differences. For accurate sim and real comparisons in parallel environments, we annotate a 325 m² office space with 1.3 km of navigation instructions and create a digitized replica in simulation. We find that sim-to-real transfer to an environment not seen in training is successful if an occupancy map and navigation graph can be collected and annotated in advance (success rate of 46.8% vs. 55.9% in sim), but much more challenging in the hardest setting with no prior mapping at all (success rate of 22.5%).
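The abstract describes an interface between the VLN agent's high-level discrete actions and the robot's continuous control: a subgoal model proposes nearby waypoints, and a low-level controller drives to the chosen one. Below is a minimal sketch of that interface, assuming a robot-centric coordinate frame; `Waypoint`, `select_subgoal`, and `drive_to` are hypothetical names, and the angular-alignment heuristic stands in for the paper's learned subgoal model, which it does not reproduce.

```python
import math
from dataclasses import dataclass


@dataclass
class Waypoint:
    x: float  # metres ahead, robot-centric frame
    y: float  # metres to the left, robot-centric frame


def select_subgoal(candidates: list[Waypoint], heading: float) -> Waypoint:
    """Pick the candidate waypoint best aligned with the heading implied by
    the agent's discrete action (a stand-in for the learned subgoal model)."""
    def angular_error(w: Waypoint) -> float:
        bearing = math.atan2(w.y, w.x)
        # Wrap the difference to (-pi, pi] before taking its magnitude.
        return abs(math.atan2(math.sin(bearing - heading),
                              math.cos(bearing - heading)))
    return min(candidates, key=angular_error)


def drive_to(waypoint: Waypoint) -> None:
    """Placeholder for the robot's low-level continuous controller,
    e.g. a local planner that tracks the waypoint while avoiding obstacles."""
    print(f"navigating to ({waypoint.x:.2f}, {waypoint.y:.2f})")


# Example: the agent chose a forward-left step (~45 degrees); the subgoal
# stage grounds that discrete action into a reachable nearby waypoint.
nearby = [Waypoint(1.0, 0.0), Waypoint(0.7, 0.7), Waypoint(0.0, 1.0)]
drive_to(select_subgoal(nearby, heading=math.pi / 4))
```

The design point this illustrates is the decoupling in the paper's pipeline: the VLN policy never emits motor commands directly, so the same simulation-trained agent can run on hardware once waypoint proposal and tracking are handled separately.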

Cite

Text

Anderson et al. "Sim-to-Real Transfer for Vision-and-Language Navigation." Conference on Robot Learning, 2020.

Markdown

[Anderson et al. "Sim-to-Real Transfer for Vision-and-Language Navigation." Conference on Robot Learning, 2020.](https://mlanthology.org/corl/2020/anderson2020corl-simtoreal/)

BibTeX

@inproceedings{anderson2020corl-simtoreal,
  title     = {{Sim-to-Real Transfer for Vision-and-Language Navigation}},
  author    = {Anderson, Peter and Shrivastava, Ayush and Truong, Joanne and Majumdar, Arjun and Parikh, Devi and Batra, Dhruv and Lee, Stefan},
  booktitle = {Conference on Robot Learning},
  year      = {2020},
  pages     = {671--681},
  volume    = {155},
  url       = {https://mlanthology.org/corl/2020/anderson2020corl-simtoreal/}
}