Temporally Consistent Depth Estimation in Videos with Recurrent Architectures
Abstract
Convolutional networks trained on large RGB-D datasets have enabled depth estimation from a single image. Many works on automotive applications rely on such approaches. However, all existing methods work in a frame-by-frame manner when applied to videos, which leads to inconsistent depth estimates over time. In this paper, we introduce for the first time an approach that yields temporally consistent depth estimates over multiple frames of a video. This is achieved by a dedicated architecture based on convolutional LSTM units and layer normalization. Our approach achieves superior performance on several error metrics when compared to independent frame processing. This improvement is also reflected in the quality of the reconstructed multi-view point clouds.
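The abstract names convolutional LSTM units with layer normalization as the core mechanism for propagating depth information across frames. The following is a minimal NumPy sketch of a single ConvLSTM update with layer-normalized gate pre-activations; it is an illustration of the general technique, not the paper's actual architecture, and the exact placement of layer normalization is an assumption.

```python
import numpy as np

def conv2d_same(x, w):
    """Naive 'same' 2D convolution. x: (Cin, H, W), w: (Cout, Cin, k, k)."""
    cout, cin, k, _ = w.shape
    p = k // 2
    xp = np.pad(x, ((0, 0), (p, p), (p, p)))
    H, W = x.shape[1:]
    out = np.zeros((cout, H, W))
    for i in range(H):
        for j in range(W):
            patch = xp[:, i:i + k, j:j + k]
            out[:, i, j] = np.tensordot(w, patch, axes=([1, 2, 3], [0, 1, 2]))
    return out

def layer_norm(x, eps=1e-5):
    """Normalize one gate map to zero mean and unit variance."""
    return (x - x.mean()) / np.sqrt(x.var() + eps)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def convlstm_step(x, h, c, Wx, Wh):
    """One ConvLSTM update: convolutions replace the dense matmuls of a
    plain LSTM, so hidden state h and cell state c keep spatial layout.
    Layer norm is applied to each gate pre-activation (an assumption)."""
    z = conv2d_same(x, Wx) + conv2d_same(h, Wh)   # (4*hidden, H, W)
    i, f, o, g = np.split(z, 4, axis=0)
    i, f, o, g = (layer_norm(t) for t in (i, f, o, g))
    c_new = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
    h_new = sigmoid(o) * np.tanh(layer_norm(c_new))
    return h_new, c_new
```

Fed one video frame at a time, the recurrent cell state carries depth evidence forward, which is what allows consecutive predictions to stay consistent rather than being computed independently per frame.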
Cite
Text
Tananaev et al. "Temporally Consistent Depth Estimation in Videos with Recurrent Architectures." European Conference on Computer Vision Workshops, 2018. doi:10.1007/978-3-030-11015-4_52
Markdown
[Tananaev et al. "Temporally Consistent Depth Estimation in Videos with Recurrent Architectures." European Conference on Computer Vision Workshops, 2018.](https://mlanthology.org/eccvw/2018/tananaev2018eccvw-temporally/) doi:10.1007/978-3-030-11015-4_52
BibTeX
@inproceedings{tananaev2018eccvw-temporally,
title = {{Temporally Consistent Depth Estimation in Videos with Recurrent Architectures}},
author = {Tananaev, Denis and Zhou, Huizhong and Ummenhofer, Benjamin and Brox, Thomas},
booktitle = {European Conference on Computer Vision Workshops},
year = {2018},
pages = {689--701},
doi = {10.1007/978-3-030-11015-4_52},
url = {https://mlanthology.org/eccvw/2018/tananaev2018eccvw-temporally/}
}