Microphone Arrays as Generalized Cameras for Integrated Audio Visual Processing

Abstract

Combinations of microphones and cameras allow joint audio-visual sensing of a scene. Such arrangements of sensors are common in biological organisms and in applications such as meeting recording and surveillance, where both modalities are necessary to provide scene understanding. Microphone arrays provide geometrical information on the source location, and allow the sound sources in the scene to be separated and the noise suppressed, while cameras allow the scene geometry and the location and motion of people and other objects to be estimated. In most previous work, the fusion of the audio-visual information occurs at a relatively late stage. In contrast, we take the viewpoint that both cameras and microphone arrays are geometry sensors, and treat the microphone arrays as generalized cameras. We employ computer-vision-inspired algorithms to treat the combined system of arrays and cameras. In particular, we consider the geometry introduced by a general microphone array and by spherical microphone arrays. The latter have a geometry that is very close to that of central projection cameras, and we show how standard vision-based calibration algorithms can be profitably applied to them. Experiments are presented that demonstrate the usefulness of the considered approach.

Cite

Text

O'Donovan et al. "Microphone Arrays as Generalized Cameras for Integrated Audio Visual Processing." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007. doi:10.1109/CVPR.2007.383345

Markdown

[O'Donovan et al. "Microphone Arrays as Generalized Cameras for Integrated Audio Visual Processing." IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2007.](https://mlanthology.org/cvpr/2007/oaposdonovan2007cvpr-microphone/) doi:10.1109/CVPR.2007.383345

BibTeX

@inproceedings{oaposdonovan2007cvpr-microphone,
  title     = {{Microphone Arrays as Generalized Cameras for Integrated Audio Visual Processing}},
  author    = {O'Donovan, Adam and Duraiswami, Ramani and Neumann, Jan},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  year      = {2007},
  doi       = {10.1109/CVPR.2007.383345},
  url       = {https://mlanthology.org/cvpr/2007/oaposdonovan2007cvpr-microphone/}
}