Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection
Abstract
People detection in 2D images has improved greatly in recent years. However, comparatively little of this progress has percolated into multi-camera multi-people tracking algorithms, whose performance still degrades severely when scenes become very crowded. In this work, we introduce a new architecture that combines Convolutional Neural Nets and Conditional Random Fields to explicitly resolve ambiguities. Among its key ingredients are high-order CRF terms that model potential occlusions and give our approach its robustness even when many people are present. Our model is trained end-to-end and we show that it outperforms several state-of-the-art algorithms on challenging scenes.
Cite
Text
Baque et al. "Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection." International Conference on Computer Vision, 2017. doi:10.1109/ICCV.2017.38
Markdown
[Baque et al. "Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection." International Conference on Computer Vision, 2017.](https://mlanthology.org/iccv/2017/baque2017iccv-deep/) doi:10.1109/ICCV.2017.38
BibTeX
@inproceedings{baque2017iccv-deep,
title = {{Deep Occlusion Reasoning for Multi-Camera Multi-Target Detection}},
author = {Baque, Pierre and Fleuret, Francois and Fua, Pascal},
booktitle = {International Conference on Computer Vision},
year = {2017},
doi = {10.1109/ICCV.2017.38},
url = {https://mlanthology.org/iccv/2017/baque2017iccv-deep/}
}