DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation
Abstract
This paper introduces an extremely efficient CNN architecture named DFANet for semantic segmentation under resource constraints. Our proposed network starts from a single lightweight backbone and aggregates discriminative features through sub-network and sub-stage cascade respectively. Based on the multi-scale feature propagation, DFANet substantially reduces the number of parameters, but still obtains sufficient receptive field and enhances the model learning ability, which strikes a balance between the speed and segmentation performance. Experiments on Cityscapes and CamVid datasets demonstrate the superior performance of DFANet with 8x fewer FLOPs and 2x faster than the existing state-of-the-art real-time semantic segmentation methods while providing comparable accuracy. Specifically, it achieves 70.3% Mean IOU on the Cityscapes test dataset with only 1.7 GFLOPs and a speed of 160 FPS on one NVIDIA Titan X card, and 71.3% Mean IOU with 3.4 GFLOPs while inferring on a higher resolution image.
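The accuracy figures above are reported as Mean IOU. As a rough illustration of what that metric measures, the following is a minimal sketch that computes mean intersection-over-union from flat sequences of predicted and ground-truth class labels. The function name `mean_iou` and its arguments are hypothetical and are not taken from the paper; the official Cityscapes evaluation accumulates counts over the whole dataset and excludes ignored labels, which this sketch does not attempt to reproduce.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes, for flat label sequences.

    Illustrative only: assumes labels are integers in [0, num_classes)
    and that no pixels need to be ignored.
    """
    intersection = [0] * num_classes
    pred_count = [0] * num_classes
    target_count = [0] * num_classes

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    # Four pixels, two classes; one pixel of class 1 is mislabeled as class 0.
    print(mean_iou([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2))
```

In the small example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.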
Cite
Text
Li et al. "DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. doi:10.1109/CVPR.2019.00975
Markdown
[Li et al. "DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019.](https://mlanthology.org/cvpr/2019/li2019cvpr-dfanet/) doi:10.1109/CVPR.2019.00975
BibTeX
@inproceedings{li2019cvpr-dfanet,
title = {{DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation}},
author = {Li, Hanchao and Xiong, Pengfei and Fan, Haoqiang and Sun, Jian},
booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
year = {2019},
doi = {10.1109/CVPR.2019.00975},
url = {https://mlanthology.org/cvpr/2019/li2019cvpr-dfanet/}
}