CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes

Abstract

We propose a network for Congested Scene Recognition, called CSRNet, that provides a data-driven deep learning method for understanding highly congested scenes, performing accurate count estimation, and producing high-quality density maps. CSRNet has two major components: a convolutional neural network (CNN) front-end for 2D feature extraction and a dilated CNN back-end, which uses dilated kernels to deliver larger receptive fields and to replace pooling operations. Because of its purely convolutional structure, CSRNet is easy to train. We evaluate CSRNet on four datasets (ShanghaiTech, UCF_CC_50, WorldEXPO'10, and UCSD) and achieve state-of-the-art performance. On the ShanghaiTech Part_B dataset, CSRNet achieves 47.3% lower Mean Absolute Error (MAE) than the previous state-of-the-art method. We also extend the targeted applications to counting other objects, such as vehicles in the TRANCOS dataset. Results show that CSRNet significantly improves output quality, with 15.4% lower MAE than the previous state-of-the-art approach.
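
The claim that dilated kernels enlarge the receptive field without pooling can be checked with a little arithmetic. The sketch below is illustrative only: the helper names, the six-layer depth, and the dilation rate of 2 are assumptions chosen for the example, not values taken from the abstract. It compares a stack of ordinary 3x3 convolutions with a stack of dilated 3x3 convolutions, both at stride 1, so neither stack reduces spatial resolution.

```python
def effective_kernel(k: int, d: int) -> int:
    """Span, in input pixels, covered by a k-tap kernel with dilation rate d."""
    return k + (k - 1) * (d - 1)


def receptive_field(layers):
    """Receptive field of a stack of conv layers along one axis.

    layers: list of (kernel_size, dilation, stride) tuples, first layer first.
    """
    rf = 1    # a single output pixel sees one input pixel before any layer
    jump = 1  # distance in input pixels between adjacent feature-map pixels
    for k, d, s in layers:
        rf += (effective_kernel(k, d) - 1) * jump
        jump *= s
    return rf


# Hypothetical back-ends: six 3x3 layers, stride 1 (assumed for illustration).
plain = [(3, 1, 1)] * 6    # ordinary convolutions
dilated = [(3, 2, 1)] * 6  # same parameter count, dilation rate 2

print(receptive_field(plain))    # 13
print(receptive_field(dilated))  # 25
```

With the same number of weights per layer and no downsampling, the dilated stack sees a 25-pixel window per axis instead of 13, which is the trade the abstract describes: a wider context without the resolution loss that pooling would introduce.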

Cite

Text

Li et al. "CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018. doi:10.1109/CVPR.2018.00120

Markdown

[Li et al. "CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes." Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018.](https://mlanthology.org/cvpr/2018/li2018cvpr-csrnet/) doi:10.1109/CVPR.2018.00120

BibTeX

@inproceedings{li2018cvpr-csrnet,
  title     = {{CSRNet: Dilated Convolutional Neural Networks for Understanding the Highly Congested Scenes}},
  author    = {Li, Yuhong and Zhang, Xiaofan and Chen, Deming},
  booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2018},
  doi       = {10.1109/CVPR.2018.00120},
  url       = {https://mlanthology.org/cvpr/2018/li2018cvpr-csrnet/}
}