DOC: Deep OCclusion Estimation from a Single Image
Abstract
In this paper, we propose a deep convolutional network architecture, called DOC, which detects object boundaries and estimates the occlusion relationships (i.e., which side of the boundary is foreground and which is background). Specifically, we first represent occlusion relations by a binary edge indicator, which marks the object boundary, and an occlusion orientation variable whose direction specifies the occlusion relationship by a left-hand rule (see Fig. 1). Our DOC networks then exploit local and non-local image cues to learn and estimate this representation and hence recover occlusion relations. To train and test DOC, we construct a large-scale instance occlusion boundary dataset using PASCAL VOC images, which we call the PASCAL instance occlusion dataset (PIOD). It contains 10,000 images and hence is two orders of magnitude larger than existing occlusion datasets for outdoor images. We test two variants of DOC on PIOD and on the BSDS ownership dataset and show that they outperform state-of-the-art methods, typically by more than 5 AP. Finally, we perform numerous experiments investigating multiple settings of DOC and transfer between BSDS and PIOD, which provide further insight for the study of occlusion estimation.
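To make the representation concrete, the following is a minimal sketch of how an orientation variable at a boundary pixel could be decoded into a foreground/background decision. It is not the authors' implementation. It assumes that the left-hand rule places the foreground on the left of the orientation direction, and that angles are measured counter-clockwise in a coordinate frame with x pointing right and y pointing up. The function names `left_normal` and `side_of_boundary` are hypothetical.

```python
import math


def left_normal(theta):
    """Unit vector pointing to the left of direction theta (radians).

    Assumption: x points right, y points up, angles are counter-clockwise.
    """
    dx, dy = math.cos(theta), math.sin(theta)
    return (-dy, dx)


def side_of_boundary(edge_xy, theta, query_xy):
    """Classify a query point relative to a boundary pixel.

    edge_xy  : (x, y) of a pixel whose binary edge indicator is 1
    theta    : occlusion orientation at that pixel, in radians
    query_xy : (x, y) of a nearby point to classify

    Assumption: the left side of the orientation direction is foreground.
    """
    dx, dy = math.cos(theta), math.sin(theta)
    px = query_xy[0] - edge_xy[0]
    py = query_xy[1] - edge_xy[1]
    # The 2D cross product is positive when the query lies to the left.
    cross = dx * py - dy * px
    if cross > 0:
        return "foreground"
    if cross < 0:
        return "background"
    return "on boundary"


if __name__ == "__main__":
    # Orientation pointing along +x: the region above the edge is on the left.
    print(side_of_boundary((0.0, 0.0), 0.0, (0.0, 1.0)))
    print(side_of_boundary((0.0, 0.0), 0.0, (0.0, -1.0)))
```

Under these assumptions, reversing the orientation by adding π swaps the two labels, which is why a single angle per boundary pixel is enough to encode the occlusion relationship.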
Cite
Text
Wang and Yuille. "DOC: Deep OCclusion Estimation from a Single Image." European Conference on Computer Vision, 2016. doi:10.1007/978-3-319-46448-0_33
Markdown
[Wang and Yuille. "DOC: Deep OCclusion Estimation from a Single Image." European Conference on Computer Vision, 2016.](https://mlanthology.org/eccv/2016/wang2016eccv-doc/) doi:10.1007/978-3-319-46448-0_33
BibTeX
@inproceedings{wang2016eccv-doc,
title = {{DOC: Deep OCclusion Estimation from a Single Image}},
author = {Wang, Peng and Yuille, Alan L.},
booktitle = {European Conference on Computer Vision},
year = {2016},
pages = {545-561},
doi = {10.1007/978-3-319-46448-0_33},
url = {https://mlanthology.org/eccv/2016/wang2016eccv-doc/}
}