AsiANet: Autoencoders in Autoencoder for Unsupervised Monocular Depth Estimation

Abstract

Monocular depth estimation is extremely challenging because it is an inherently ambiguous and ill-posed problem. Unsupervised approaches to monocular depth estimation with convolutional neural networks have attracted considerable interest, since it is now feasible to learn from a set of rectified stereo image pairs without ground-truth depths and to predict scene geometry from a single image. The proposed approach trains an encoder-decoder network architecture, referred to as autoencoders in autoencoder (AsiANet), in an unsupervised fashion to discover the implicit relationship between a single image and its corresponding depth map. AsiANet uses a unique Inception-like pooling module based on fractional max-pooling for dimensionality reduction. Experiments on the KITTI benchmark dataset show that the proposed architecture, trained with the Charbonnier loss function, achieves superior depth-map prediction compared to previous unsupervised monocular depth estimation methods.
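The abstract mentions training with the Charbonnier loss, a smooth, robust variant of the L1 loss commonly used in image reconstruction. A minimal NumPy sketch is given below; the function name and the epsilon value are illustrative assumptions, not details taken from the paper.

```python
import numpy as np

def charbonnier_loss(pred, target, eps=1e-3):
    """Charbonnier loss: sqrt((pred - target)^2 + eps^2), averaged.

    A differentiable approximation of the absolute error; eps (an
    assumed value here, not from the paper) smooths the loss near zero.
    """
    diff = pred - target
    return np.mean(np.sqrt(diff ** 2 + eps ** 2))
```

As eps approaches zero the loss approaches the mean absolute error, while for small residuals it behaves quadratically, which makes optimization more stable than plain L1.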

Cite

Text

Yusiong and Naval. "AsiANet: Autoencoders in Autoencoder for Unsupervised Monocular Depth Estimation." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019, pp. 443-451. doi:10.1109/WACV.2019.00053

Markdown

[Yusiong and Naval. "AsiANet: Autoencoders in Autoencoder for Unsupervised Monocular Depth Estimation." IEEE/CVF Winter Conference on Applications of Computer Vision, 2019, pp. 443-451.](https://mlanthology.org/wacv/2019/yusiong2019wacv-asianet/) doi:10.1109/WACV.2019.00053

BibTeX

@inproceedings{yusiong2019wacv-asianet,
  title     = {{AsiANet: Autoencoders in Autoencoder for Unsupervised Monocular Depth Estimation}},
  author    = {Yusiong, John Paul Tan and Naval, Prospero C.},
  booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
  year      = {2019},
  pages     = {443--451},
  doi       = {10.1109/WACV.2019.00053},
  url       = {https://mlanthology.org/wacv/2019/yusiong2019wacv-asianet/}
}