NAS-Count: Counting-by-Density with Neural Architecture Search

Abstract

Most of the recent advances in crowd counting have evolved from hand-designed density estimation networks, where multi-scale features are leveraged to address the scale variation problem, but at the expense of demanding design efforts. In this work, we automate the design of counting models with Neural Architecture Search (NAS) and introduce an end-to-end searched encoder-decoder architecture, Automatic Multi-Scale Network (AMSNet). Specifically, we utilize a counting-specific two-level search space. The encoder and decoder in AMSNet are composed of different cells discovered from micro-level search, while the multi-path architecture is explored through macro-level search. To solve the pixel-level isolation issue in MSE loss, AMSNet is optimized with an auto-searched Scale Pyramid Pooling Loss (SPPLoss) that supervises the multi-scale structural information. Extensive experiments on four datasets show AMSNet produces state-of-the-art results that outperform hand-designed models, fully demonstrating the efficacy of NAS-Count.
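The "pixel-level isolation" of MSE and the idea of supervising a density map at several scales can be illustrated with a minimal sketch. This is not the authors' SPPLoss: the pooling type (non-overlapping average pooling), the scale set `(1, 2, 4)`, the unweighted sum, and the function names `avg_pool` and `pyramid_pooling_loss` are all assumptions made for illustration, whereas the abstract states that the actual loss configuration is found by search.

```python
import numpy as np

def avg_pool(density, k):
    """Non-overlapping k x k average pooling (edges not divisible by k are cropped)."""
    h, w = density.shape
    h2, w2 = (h // k) * k, (w // k) * k
    blocks = density[:h2, :w2].reshape(h2 // k, k, w2 // k, k)
    return blocks.mean(axis=(1, 3))

def pyramid_pooling_loss(pred, target, scales=(1, 2, 4)):
    """Sum of MSE terms between pooled maps; the scale set is a hypothetical choice."""
    total = 0.0
    for k in scales:
        diff = avg_pool(pred, k) - avg_pool(target, k)
        total += float(np.mean(diff ** 2))
    return total

# Toy 4x4 density maps: one person in the ground truth at (0, 0).
target = np.zeros((4, 4)); target[0, 0] = 1.0
pred_near = np.zeros((4, 4)); pred_near[0, 1] = 1.0   # off by one pixel
pred_far = np.zeros((4, 4)); pred_far[3, 3] = 1.0     # opposite corner

# Plain per-pixel MSE cannot tell the two predictions apart.
mse_near = float(np.mean((pred_near - target) ** 2))
mse_far = float(np.mean((pred_far - target) ** 2))

# The pooled loss penalizes the far miss more than the near miss.
loss_near = pyramid_pooling_loss(pred_near, target)
loss_far = pyramid_pooling_loss(pred_far, target)
```

In this toy case the per-pixel MSE is identical for both predictions, because each pixel is compared in isolation. The pooled terms add a penalty only for the prediction that places the person in the wrong region, which is the kind of structural signal a multi-scale loss is meant to provide.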

Cite

Text

Hu et al. "NAS-Count: Counting-by-Density with Neural Architecture Search." Proceedings of the European Conference on Computer Vision (ECCV), 2020. doi:10.1007/978-3-030-58542-6_45

Markdown

[Hu et al. "NAS-Count: Counting-by-Density with Neural Architecture Search." Proceedings of the European Conference on Computer Vision (ECCV), 2020.](https://mlanthology.org/eccv/2020/hu2020eccv-nascount/) doi:10.1007/978-3-030-58542-6_45

BibTeX

@inproceedings{hu2020eccv-nascount,
  title     = {{NAS-Count: Counting-by-Density with Neural Architecture Search}},
  author    = {Hu, Yutao and Jiang, Xiaolong and Liu, Xuhui and Zhang, Baochang and Han, Jungong and Cao, Xianbin and Doermann, David},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2020},
  doi       = {10.1007/978-3-030-58542-6_45},
  url       = {https://mlanthology.org/eccv/2020/hu2020eccv-nascount/}
}