Is Mapping Necessary for Realistic PointGoal Navigation?
Abstract
Can an autonomous agent navigate in a new environment without building an explicit map? For the task of PointGoal navigation ('Go to (x, y)') under idealized settings (no RGB-D and actuation noise, perfect GPS+Compass), the answer is a clear 'yes' - map-less neural models composed of task-agnostic components (CNNs and RNNs) trained with large-scale reinforcement learning achieve 100% Success on a standard dataset (Gibson). However, for PointNav in a realistic setting (RGB-D and actuation noise, no GPS+Compass), this is an open question; one we tackle in this paper. The strongest published result for this task is 71.7% Success. First, we identify the main (perhaps, only) cause of the drop in performance: absence of GPS+Compass. An agent with perfect GPS+Compass faced with RGB-D sensing and actuation noise achieves 99.8% Success (Gibson-v2 val). This suggests that (to paraphrase a meme) robust visual odometry is all we need for realistic PointNav; if we can achieve that, we can ignore the sensing and actuation noise. With that as our operating hypothesis, we scale dataset size, model size, and develop human-annotation-free data-augmentation techniques to train neural models for visual odometry. We advance state of the art on the Habitat Realistic PointNav Challenge - SPL by 40% (relative), 53 to 74, and Success by 31% (relative), 71 to 94. While our approach does not saturate or 'solve' this dataset, this strong improvement combined with promising zero-shot sim2real transfer (to a LoCoBot robot) provides evidence consistent with the hypothesis that explicit mapping may not be necessary for navigation, even in a realistic setting.
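To make concrete how visual odometry might stand in for GPS+Compass, the following is a minimal sketch (not the authors' implementation) of re-expressing the goal in the agent's frame after each step, given an egomotion estimate. The function name `update_goal`, the 2D frame convention (x forward, y left, counter-clockwise positive rotation), and the `(dx, dy, dtheta)` egomotion format are all assumptions made for illustration; in practice the egomotion would come from a learned visual odometry model rather than being supplied by hand.

```python
import math

def update_goal(goal_xy, egomotion):
    """Re-express the goal in the agent's new frame after one step.

    Assumed convention (hypothetical, for illustration only):
    - goal_xy: (x, y) of the goal in the agent's previous frame,
      with x pointing forward and y pointing left.
    - egomotion: (dx, dy, dtheta), the agent's estimated translation
      and counter-clockwise rotation, expressed in the previous frame.
    """
    gx, gy = goal_xy
    dx, dy, dtheta = egomotion
    # Shift the goal by the agent's translation.
    sx, sy = gx - dx, gy - dy
    # Rotate by -dtheta to account for the agent's new heading.
    c, s = math.cos(dtheta), math.sin(dtheta)
    return (c * sx + s * sy, -s * sx + c * sy)

# Goal starts 1 m straight ahead; the agent moves forward 0.25 m,
# then turns left by 90 degrees.
goal = (1.0, 0.0)
goal = update_goal(goal, (0.25, 0.0, 0.0))
goal = update_goal(goal, (0.0, 0.0, math.pi / 2))
```

Under this convention, errors in each egomotion estimate accumulate in the goal estimate over an episode, which is one way to see why the abstract stresses robust visual odometry.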
Cite
Text
Partsey et al. "Is Mapping Necessary for Realistic PointGoal Navigation?." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01672
Markdown
[Partsey et al. "Is Mapping Necessary for Realistic PointGoal Navigation?." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/partsey2022cvpr-mapping/) doi:10.1109/CVPR52688.2022.01672
BibTeX
@inproceedings{partsey2022cvpr-mapping,
title = {{Is Mapping Necessary for Realistic PointGoal Navigation?}},
author = {Partsey, Ruslan and Wijmans, Erik and Yokoyama, Naoki and Dobosevych, Oles and Batra, Dhruv and Maksymets, Oleksandr},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2022},
pages = {17232-17241},
doi = {10.1109/CVPR52688.2022.01672},
url = {https://mlanthology.org/cvpr/2022/partsey2022cvpr-mapping/}
}