SC-wLS: Towards Interpretable Feed-Forward Camera Re-Localization
Abstract
Visual re-localization aims to recover camera poses in a known environment, which is vital for applications like robotics or augmented reality. Feed-forward absolute camera pose regression methods directly output poses from a network, but suffer from low accuracy. Meanwhile, scene coordinate based methods are accurate, but need iterative RANSAC post-processing, which hinders efficient end-to-end training and inference. In order to have the best of both worlds, we propose a feed-forward method termed SC-wLS that exploits all scene coordinate estimates for weighted least squares pose regression. This differentiable formulation employs a weight network imposed on 2D-3D correspondences, and requires pose supervision only. Qualitative results demonstrate the interpretability of the learned weights. Evaluations on the 7Scenes and Cambridge datasets show significantly improved performance compared with previous feed-forward counterparts. Moreover, our SC-wLS method enables a new capability: self-supervised test-time adaptation of the weight network. Code and models are publicly available.
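To make the "weighted least squares pose regression" idea concrete, below is a minimal NumPy sketch of one standard instance of it: a weighted DLT solve that recovers a pose from 2D-3D correspondences, where each correspondence is scaled by a per-point weight (e.g. as produced by a weight network). This is an illustrative sketch under common assumptions (known intrinsics, SVD-based solve), not the paper's implementation; the function and variable names are hypothetical.

```python
# Illustrative weighted-DLT pose estimation from 2D-3D correspondences.
# Assumptions (not from the paper): known intrinsics K, per-correspondence
# weights w in [0, 1], and a plain SVD null-space solve.
import numpy as np

def weighted_dlt_pose(pts2d, pts3d, w, K):
    """Estimate [R|t] from N weighted 2D-3D correspondences via DLT.

    pts2d: (N, 2) pixel coordinates, pts3d: (N, 3) scene coordinates,
    w: (N,) non-negative weights, K: (3, 3) camera intrinsics.
    """
    # Work in normalized camera coordinates so the projection matrix is [R|t].
    xn = (np.linalg.inv(K) @ np.c_[pts2d, np.ones(len(pts2d))].T).T  # (N, 3)
    Xh = np.c_[pts3d, np.ones(len(pts3d))]                           # (N, 4)

    # Each correspondence contributes two rows of the system A p = 0,
    # scaled by sqrt(w) so the least-squares cost is weighted by w.
    s = np.sqrt(w)[:, None]
    zeros = np.zeros_like(Xh)
    rows_u = s * np.hstack([Xh, zeros, -xn[:, :1] * Xh])
    rows_v = s * np.hstack([zeros, Xh, -xn[:, 1:2] * Xh])
    A = np.vstack([rows_u, rows_v])                                   # (2N, 12)

    # Minimize ||A p|| subject to ||p|| = 1: smallest right singular vector.
    _, _, Vt = np.linalg.svd(A)
    P = Vt[-1].reshape(3, 4)

    # Project the left 3x3 block onto the rotation group and fix scale/sign.
    U, S, Vt_r = np.linalg.svd(P[:, :3])
    R = U @ Vt_r
    t = P[:, 3] / S.mean()
    if np.linalg.det(R) < 0:   # enforce a proper rotation, det(R) = +1
        R, t = -R, -t
    return R, t
```

In this form the solve is differentiable with respect to the weights (through the SVD), which is what allows a weight network over correspondences to be trained end-to-end from a pose loss alone; low weights effectively suppress outlier correspondences that RANSAC would otherwise have to reject.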
Cite
Text
Wu et al. "SC-wLS: Towards Interpretable Feed-Forward Camera Re-Localization." Proceedings of the European Conference on Computer Vision (ECCV), 2022. doi:10.1007/978-3-031-19769-7_34
Markdown
[Wu et al. "SC-wLS: Towards Interpretable Feed-Forward Camera Re-Localization." Proceedings of the European Conference on Computer Vision (ECCV), 2022.](https://mlanthology.org/eccv/2022/wu2022eccv-scwls/) doi:10.1007/978-3-031-19769-7_34
BibTeX
@inproceedings{wu2022eccv-scwls,
title = {{SC-wLS: Towards Interpretable Feed-Forward Camera Re-Localization}},
author = {Wu, Xin and Zhao, Hao and Li, Shunkai and Cao, Yingdian and Zha, Hongbin},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2022},
doi = {10.1007/978-3-031-19769-7_34},
url = {https://mlanthology.org/eccv/2022/wu2022eccv-scwls/}
}