Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection

Abstract

People detection in 2D images has improved greatly in recent years. However, comparatively little of this progress has percolated into multi-camera multi-people tracking algorithms, whose performance still degrades severely when scenes become very crowded. In this work, we introduce a new architecture that combines Convolutional Neural Nets and Conditional Random Fields to explicitly resolve ambiguities. One of its key ingredients is a set of high-order CRF terms that model potential occlusions and give our approach its robustness even when many people are present. Our model is trained end-to-end, and we show that it outperforms several state-of-the-art algorithms on challenging scenes.

Cite

Text

Baque et al. "Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection." International Conference on Computer Vision, 2017. doi:10.1109/ICCV.2017.38

Markdown

[Baque et al. "Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection." International Conference on Computer Vision, 2017.](https://mlanthology.org/iccv/2017/baque2017iccv-deep/) doi:10.1109/ICCV.2017.38

BibTeX

@inproceedings{baque2017iccv-deep,
  title     = {{Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection}},
  author    = {Baque, Pierre and Fleuret, Francois and Fua, Pascal},
  booktitle = {International Conference on Computer Vision},
  year      = {2017},
  doi       = {10.1109/ICCV.2017.38},
  url       = {https://mlanthology.org/iccv/2017/baque2017iccv-deep/}
}