Deep Context Modeling for Semantic Segmentation
Abstract
Deep convolutional neural networks (DCNNs) have been employed in many computer vision tasks with great success due to their robustness in feature learning. One advantage of DCNNs is that their representations are robust to object location, which is useful for object recognition tasks. However, this robustness also discards spatial information, which is needed when a task depends on the topological structure of the image (e.g. scene parsing, face recognition). Adopting graphical models (GMs) to incorporate spatial and contextual information into DCNNs is expected to improve the performance of DCNN-based computer vision tasks. Recent research has shown that combining DCNNs and Conditional Random Fields (CRFs) can significantly improve scene parsing accuracy. This is achieved either by combining their independent outputs or by applying them as a cascade. In this work, we propose a novel strategy to incorporate CRFs deeper inside DCNNs by modeling a CRF as a DCNN layer that can be plugged in at any layer of a DCNN. This implants spatial and contextual information into the DCNN, allowing end-to-end training, better control of the spatial constraints, and improved segmentation accuracy. The new strategy for coupling graphical models with a state-of-the-art fully convolutional neural network has shown promising results on the PASCAL-Context dataset.
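The abstract does not spell out the CRF formulation, so the following is only a minimal sketch of the general idea of expressing CRF inference as a layer that maps a score map to a refined map of the same shape. It assumes a swapped-in, commonly used technique (a few mean-field iterations with a Potts label compatibility and a small box-shaped neighbourhood), which is not necessarily the authors' method. The names `crf_layer`, `box_filter`, `compat`, `n_iters`, and `radius` are hypothetical and chosen for illustration.

```python
import numpy as np

def softmax(x, axis=0):
    # Numerically stable softmax along the class axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def box_filter(q, radius=1):
    # Sum each class map over a (2r+1)x(2r+1) neighbourhood,
    # excluding the centre pixel; borders are zero-padded.
    c, h, w = q.shape
    padded = np.pad(q, ((0, 0), (radius, radius), (radius, radius)),
                    mode="constant")
    out = np.zeros_like(q)
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            if dy == radius and dx == radius:
                continue
            out += padded[:, dy:dy + h, dx:dx + w]
    return out

def crf_layer(unary, compat, n_iters=5, radius=1):
    # unary:  (C, H, W) class scores, higher means more likely.
    # compat: (C, C) penalty for neighbouring pixels taking labels (i, j).
    # Assumption: mean-field inference, not the paper's confirmed method.
    q = softmax(unary, axis=0)
    for _ in range(n_iters):
        msg = box_filter(q, radius)                       # message passing
        pairwise = np.einsum("ij,jhw->ihw", compat, msg)  # compatibility
        q = softmax(unary - pairwise, axis=0)             # update, normalise
    return q

# Toy input: class 0 everywhere, with one weakly "class 1" centre pixel.
unary = np.zeros((2, 5, 5))
unary[0] = 1.0
unary[1, 2, 2] = 1.5
potts = 1.0 - np.eye(2)  # penalise neighbours that disagree
refined = crf_layer(unary, potts)
```

Because every step is a filter, a matrix product, or a softmax, a layer of this kind is differentiable and outputs a map with the same shape as its input, which is what makes it possible in principle to place it between other layers and train end to end. In the toy input, the isolated centre pixel is pulled toward the label of its neighbours.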
Cite
Text
Thanh et al. "Deep Context Modeling for Semantic Segmentation." IEEE/CVF Winter Conference on Applications of Computer Vision, 2017. doi:10.1109/WACV.2017.14
Markdown
[Thanh et al. "Deep Context Modeling for Semantic Segmentation." IEEE/CVF Winter Conference on Applications of Computer Vision, 2017.](https://mlanthology.org/wacv/2017/thanh2017wacv-deep/) doi:10.1109/WACV.2017.14
BibTeX
@inproceedings{thanh2017wacv-deep,
title = {{Deep Context Modeling for Semantic Segmentation}},
author = {Thanh, Kien Nguyen and Fookes, Clinton and Sridharan, Sridha},
booktitle = {IEEE/CVF Winter Conference on Applications of Computer Vision},
year = {2017},
pages = {56-63},
doi = {10.1109/WACV.2017.14},
url = {https://mlanthology.org/wacv/2017/thanh2017wacv-deep/}
}