Audio-Visual Foreground Extraction for Event Characterization
Abstract
This paper presents a new method that integrates audio and visual information for scene analysis in a typical surveillance scenario, using only one camera and one monaural microphone. Visual information is analyzed by a standard visual background/foreground (BG/FG) modelling module, enhanced with a novelty detection stage, and coupled with an audio BG/FG modelling scheme. The audio-visual association is performed on-line by exploiting the concept of synchrony. Experimental tests on the classification and clustering of events demonstrate the potential of the proposed approach, including in comparison with the results obtained from each modality alone.
Cite
Text
Cristani et al. "Audio-Visual Foreground Extraction for Event Characterization." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2006. doi:10.1109/CVPRW.2006.33
Markdown
[Cristani et al. "Audio-Visual Foreground Extraction for Event Characterization." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2006.](https://mlanthology.org/cvprw/2006/cristani2006cvprw-audiovisual/) doi:10.1109/CVPRW.2006.33
BibTeX
@inproceedings{cristani2006cvprw-audiovisual,
title = {{Audio-Visual Foreground Extraction for Event Characterization}},
author = {Cristani, Marco and Bicego, Manuele and Murino, Vittorio},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2006},
pages = {116},
doi = {10.1109/CVPRW.2006.33},
url = {https://mlanthology.org/cvprw/2006/cristani2006cvprw-audiovisual/}
}