MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching

Abstract

Recent methods in stereo matching have continuously improved accuracy using deep models. This gain, however, comes with a steep increase in computation cost, such that the network may not fit even on a moderate GPU. This becomes a problem when the model needs to be deployed on resource-limited devices. To address this, we propose two lightweight models for stereo vision with reduced complexity and without sacrificing accuracy. Depending on the dimension of the cost volume, we design a 2D and a 3D model with encoder-decoders built from 2D and 3D convolutions, respectively. To this end, we leverage 2D MobileNet blocks and extend them to 3D for stereo vision applications. In addition, a new cost volume is proposed to boost the accuracy of the 2D model, making it perform close to 3D networks. Experiments show that the proposed 2D/3D networks effectively reduce the computational expense (27%/95% fewer parameters/operations in the 2D model and 72%/38% fewer in the 3D model) while upholding the accuracy. Code: https://github.com/cogsys-tuebingen/mobilestereonet.
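As a rough illustration of why extending MobileNet blocks to 3D can shrink a network, the sketch below compares the weight count of a standard 3D convolution with that of a depthwise separable one, the factorization that MobileNet blocks are built on. The function names and the channel counts are assumptions chosen for illustration; they are not taken from the paper, whose actual blocks may differ (for example in expansion layers, normalization, or bias terms).

```python
# Illustrative only: weight counts for one 3D convolution layer, bias omitted.
# The channel counts below are hypothetical, not values from the paper.

def conv3d_params(c_in, c_out, k=3):
    """Weights in a standard k x k x k 3D convolution."""
    return k ** 3 * c_in * c_out


def separable_conv3d_params(c_in, c_out, k=3):
    """Weights in a depthwise k x k x k conv followed by a pointwise 1x1x1 conv."""
    depthwise = k ** 3 * c_in   # one k^3 filter per input channel
    pointwise = c_in * c_out    # 1x1x1 conv that mixes channels
    return depthwise + pointwise


c_in, c_out = 32, 32  # assumed channel widths
standard = conv3d_params(c_in, c_out)
separable = separable_conv3d_params(c_in, c_out)
reduction = 1 - separable / standard

print(f"standard:  {standard} weights")
print(f"separable: {separable} weights")
print(f"reduction: {reduction:.1%}")
```

The saving for a single layer is larger than the whole-network figures quoted in the abstract, which is expected: a full model also contains layers that are not factorized this way, and parameter counts and operation counts do not scale identically.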

Cite

Text

Shamsafar et al. "MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching." Winter Conference on Applications of Computer Vision, 2022.

Markdown

[Shamsafar et al. "MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching." Winter Conference on Applications of Computer Vision, 2022.](https://mlanthology.org/wacv/2022/shamsafar2022wacv-mobilestereonet/)

BibTeX

@inproceedings{shamsafar2022wacv-mobilestereonet,
  title     = {{MobileStereoNet: Towards Lightweight Deep Networks for Stereo Matching}},
  author    = {Shamsafar, Faranak and Woerz, Samuel and Rahim, Rafia and Zell, Andreas},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2022},
  pages     = {2417-2426},
  url       = {https://mlanthology.org/wacv/2022/shamsafar2022wacv-mobilestereonet/}
}