Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

Ke, Liyiming; Li, Xiujun; Bisk, Yonatan; Holtzman, Ari; Gan, Zhe; Liu, Jingjing; Gao, Jianfeng; Choi, Yejin; Srinivasa, Siddhartha

doi:10.1109/CVPR.2019.00690

CVPR 2019

Abstract

We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding that achieves state-of-the-art results on the 2018 Room-to-Room (R2R) Vision-and-Language Navigation challenge. Given a natural language instruction and photo-realistic image views of a previously unseen environment, the agent is tasked with navigating from a source to a target location as quickly as possible. While all current approaches either make local action decisions or score entire trajectories using beam search, ours balances local and global signals when exploring an unobserved environment. Importantly, this lets us act greedily but use global signals to backtrack when necessary. Applying the FAST framework to existing state-of-the-art models achieved a 17% relative gain, an absolute 6% gain, on Success rate weighted by Path Length (SPL).
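To make the idea of acting greedily while backtracking on a global signal more concrete, here is a minimal Python sketch. It is not the authors' implementation: the toy graph, the hand-written move scores, the function name `frontier_search`, and the use of accumulated log-probability as the global signal are all assumptions for illustration. The abstract does not say how FAST combines local and global signals, so a plain sum stands in for that step.

```python
import heapq


def frontier_search(graph, scores, start, goal, max_steps=50):
    """Best-first expansion over a frontier of partial paths (illustrative only).

    graph:  dict mapping a node to its neighbours (hypothetical toy map)
    scores: dict mapping (node, neighbour) to a local log-probability
    Returns (path, backtracks); path is None if the goal was not reached.
    """
    # Heap entries are (negated accumulated score, partial path).
    frontier = [(0.0, (start,))]
    visited = set()
    current = start
    backtracks = 0
    for _ in range(max_steps):
        if not frontier:
            break
        neg_score, path = heapq.heappop(frontier)
        node = path[-1]
        if node in visited:
            continue
        # If the best partial path does not continue from where the agent
        # stands, the agent has to return to an earlier node: a backtrack.
        if len(path) > 1 and path[-2] != current:
            backtracks += 1
        visited.add(node)
        current = node
        if node == goal:
            return list(path), backtracks
        for nxt in graph.get(node, []):
            if nxt not in visited:
                # Assumed global signal: sum of local log-probabilities.
                new_score = neg_score - scores[(node, nxt)]
                heapq.heappush(frontier, (new_score, path + (nxt,)))
    return None, backtracks


# Toy environment: the locally attractive move A -> B leads to a poor branch.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["E"]}
scores = {("A", "B"): -0.1, ("A", "C"): -0.7, ("B", "D"): -2.0, ("C", "E"): -0.2}

path, backtracks = frontier_search(graph, scores, start="A", goal="E")
print(path, backtracks)
```

In this toy run the agent first follows the locally best move to B, finds that the only continuation scores poorly, and returns to expand C, reaching E with a single backtrack. A purely greedy decoder would have committed to the B branch.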

Cite

Text

Ke et al. "Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00690

Markdown

[Ke et al. "Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/ke2019cvpr-tactical/) doi:10.1109/CVPR.2019.00690

BibTeX

@inproceedings{ke2019cvpr-tactical,
  title     = {{Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation}},
  author    = {Ke, Liyiming and Li, Xiujun and Bisk, Yonatan and Holtzman, Ari and Gan, Zhe and Liu, Jingjing and Gao, Jianfeng and Choi, Yejin and Srinivasa, Siddhartha},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2019},
  doi       = {10.1109/CVPR.2019.00690},
  url       = {https://mlanthology.org/cvpr/2019/ke2019cvpr-tactical/}
}