A Probabilistic Framework for Multi-Modal Multi-Person Tracking

Abstract

In this paper, we present a probabilistic tracking framework that combines sound and vision to achieve more robust and accurate tracking of multiple objects. In a cluttered or noisy scene, measurements have a non-Gaussian, multi-modal distribution, so we apply a particle filter to track multiple people using combined audio and video observations. We demonstrate the algorithm on person tracking, using a stereo-based visual foreground detection algorithm for vision and a beamforming technique for audio localization. Our model also accurately estimates the number of people present. We test the efficacy of our system on a sequence of multiple people moving and speaking in an indoor environment.
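The core idea the abstract describes — a particle filter whose update step multiplies likelihoods from independent audio and video observations — can be sketched as follows. This is an illustrative single-target toy, not the paper's method: the Gaussian likelihoods, noise levels, and random-walk motion model are all assumptions standing in for the paper's beamformer and stereo foreground measurement models.

```python
import numpy as np

rng = np.random.default_rng(0)

def particle_filter_step(particles, weights, audio_obs, video_obs,
                         motion_std=0.05, audio_std=0.30, video_std=0.10):
    """One predict-update-resample cycle fusing two observation modalities."""
    # Predict: random-walk motion model (a stand-in for the true dynamics).
    particles = particles + rng.normal(0.0, motion_std, size=particles.shape)
    # Update: fused likelihood is the product of per-modality Gaussians,
    # assuming audio and video noise are conditionally independent.
    d_audio = np.linalg.norm(particles - audio_obs, axis=1)
    d_video = np.linalg.norm(particles - video_obs, axis=1)
    lik = (np.exp(-0.5 * (d_audio / audio_std) ** 2)
           * np.exp(-0.5 * (d_video / video_std) ** 2))
    weights = weights * lik
    weights = weights / weights.sum()
    # Resample (multinomial) to avoid weight degeneracy.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx], np.full(len(particles), 1.0 / len(particles))

# Simulate one person walking in 2-D floor coordinates.
true_pos = np.array([0.0, 0.0])
particles = rng.uniform(-1.0, 1.0, size=(500, 2))
weights = np.full(500, 1.0 / 500)
for _ in range(30):
    true_pos = true_pos + np.array([0.02, 0.01])          # person moves
    audio_obs = true_pos + rng.normal(0.0, 0.30, size=2)  # coarse beamformer fix
    video_obs = true_pos + rng.normal(0.0, 0.10, size=2)  # finer visual fix
    particles, weights = particle_filter_step(particles, weights,
                                              audio_obs, video_obs)

estimate = particles.mean(axis=0)
print(np.linalg.norm(estimate - true_pos))  # tracking error stays small
```

Because the two modalities enter only through the likelihood product, a noisy or missing modality degrades the posterior gracefully rather than breaking the tracker — the property the paper exploits for robustness in clutter.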

Cite

Text

Checka et al. "A Probabilistic Framework for Multi-Modal Multi-Person Tracking." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2003. doi:10.1109/CVPRW.2003.10099

Markdown

[Checka et al. "A Probabilistic Framework for Multi-Modal Multi-Person Tracking." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2003.](https://mlanthology.org/cvprw/2003/checka2003cvprw-probabilistic/) doi:10.1109/CVPRW.2003.10099

BibTeX

@inproceedings{checka2003cvprw-probabilistic,
  title     = {{A Probabilistic Framework for Multi-Modal Multi-Person Tracking}},
  author    = {Checka, Neal and Wilson, Kevin W. and Rangarajan, Vibhav and Darrell, Trevor},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2003},
  pages     = {100},
  doi       = {10.1109/CVPRW.2003.10099},
  url       = {https://mlanthology.org/cvprw/2003/checka2003cvprw-probabilistic/}
}