How to Improve CNN-Based 6-DoF Camera Pose Estimation

Abstract

Convolutional neural networks (CNNs) and transfer learning have recently been used for 6 degrees of freedom (6-DoF) camera pose estimation. While they do not reach the same accuracy as visual SLAM-based approaches and are restricted to a specific environment, they excel in robustness and can be applied even to a single image. In this paper, we study PoseNet [1] and investigate modifications based on the datasets' characteristics to improve the accuracy of the pose estimates. In particular, we emphasize the importance of field-of-view over image resolution; we present a data augmentation scheme to reduce overfitting; we study the effect of Long Short-Term Memory (LSTM) cells. Lastly, we combine these modifications and improve PoseNet's performance for monocular CNN-based camera pose regression.

Cite

Text

Seifi and Tuytelaars. "How to Improve CNN-Based 6-DoF Camera Pose Estimation." IEEE/CVF International Conference on Computer Vision Workshops, 2019. doi:10.1109/ICCVW.2019.00471

Markdown

[Seifi and Tuytelaars. "How to Improve CNN-Based 6-DoF Camera Pose Estimation." IEEE/CVF International Conference on Computer Vision Workshops, 2019.](https://mlanthology.org/iccvw/2019/seifi2019iccvw-improve/) doi:10.1109/ICCVW.2019.00471

BibTeX

@inproceedings{seifi2019iccvw-improve,
  title     = {{How to Improve CNN-Based 6-DoF Camera Pose Estimation}},
  author    = {Seifi, Soroush and Tuytelaars, Tinne},
  booktitle = {IEEE/CVF International Conference on Computer Vision Workshops},
  year      = {2019},
  pages     = {3788--3795},
  doi       = {10.1109/ICCVW.2019.00471},
  url       = {https://mlanthology.org/iccvw/2019/seifi2019iccvw-improve/}
}