A Probabilistic Framework for Multi-Modal Multi-Person Tracking
Abstract
In this paper, we present a probabilistic tracking framework that combines sound and vision to achieve more robust and accurate tracking of multiple objects. In a cluttered or noisy scene, our measurements have a non-Gaussian, multi-modal distribution. We apply a particle filter to track multiple people using combined audio and video observations. We apply our algorithm to tracking people using a stereo-based visual foreground detection algorithm and a beamforming-based audio localization technique. Our model also accurately reflects the number of people present. We test the efficacy of our system on a sequence of multiple people moving and speaking in an indoor environment.
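The fusion idea in the abstract can be illustrated with a minimal particle filter that treats audio and video position measurements as conditionally independent likelihoods. This is a 1-D sketch under assumed Gaussian noise models; the noise parameters, motion model, and function names here are illustrative, not those of the paper.

```python
import math
import random

random.seed(0)

def gaussian_likelihood(x, obs, std):
    """Unnormalized Gaussian likelihood of an observation given a particle state."""
    return math.exp(-0.5 * ((x - obs) / std) ** 2)

def particle_filter_step(particles, video_obs, audio_obs,
                         motion_std=0.1, video_std=0.3, audio_std=0.5):
    """One predict/weight/resample step fusing a video and an audio measurement.
    All parameters are illustrative assumptions, not values from the paper."""
    # Predict: propagate each particle with a random-walk motion model.
    particles = [p + random.gauss(0.0, motion_std) for p in particles]
    # Weight: fuse the two modalities as conditionally independent likelihoods.
    weights = [gaussian_likelihood(p, video_obs, video_std) *
               gaussian_likelihood(p, audio_obs, audio_std)
               for p in particles]
    total = sum(weights)
    weights = [w / total for w in weights]
    # Resample particles in proportion to their weights.
    return random.choices(particles, weights=weights, k=len(particles))

# Usage: track a stationary 1-D target whose audio and video readings sit near 2.0.
particles = [random.gauss(0.0, 1.0) for _ in range(500)]
for _ in range(20):
    particles = particle_filter_step(particles, 2.0, 2.0)
estimate = sum(particles) / len(particles)
```

Because both likelihoods peak at the same location, the resampled particle cloud concentrates there; the multiplicative fusion is what lets one modality sharpen the posterior when the other is noisy or ambiguous.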
Cite
Text
Checka et al. "A Probabilistic Framework for Multi-Modal Multi-Person Tracking." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2003. doi:10.1109/CVPRW.2003.10099
Markdown
[Checka et al. "A Probabilistic Framework for Multi-Modal Multi-Person Tracking." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2003.](https://mlanthology.org/cvprw/2003/checka2003cvprw-probabilistic/) doi:10.1109/CVPRW.2003.10099
BibTeX
@inproceedings{checka2003cvprw-probabilistic,
title = {{A Probabilistic Framework for Multi-Modal Multi-Person Tracking}},
author = {Checka, Neal and Wilson, Kevin W. and Rangarajan, Vibhav and Darrell, Trevor},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2003},
pages = {100},
doi = {10.1109/CVPRW.2003.10099},
url = {https://mlanthology.org/cvprw/2003/checka2003cvprw-probabilistic/}
}