Semantic Learning for Audio Applications: A Computer Vision Approach

Abstract

Recent work in machine learning has significantly benefited semantic extraction tasks in computer vision, particularly for object recognition and image retrieval. We argue that the computer vision techniques that have been successfully applied in those settings can effectively be translated to other domains, such as audio. This claim is supported by recent results in music vs. speech classification, structure from sound, robust music identification and sound object recognition. This paper focuses on two such audio applications and demonstrates how ideas from computer vision map naturally to these problems.

Cite

Text

Sukthankar et al. "Semantic Learning for Audio Applications: A Computer Vision Approach." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2006. doi:10.1109/CVPRW.2006.191

Markdown

[Sukthankar et al. "Semantic Learning for Audio Applications: A Computer Vision Approach." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2006.](https://mlanthology.org/cvprw/2006/sukthankar2006cvprw-semantic/) doi:10.1109/CVPRW.2006.191

BibTeX

@inproceedings{sukthankar2006cvprw-semantic,
  title     = {{Semantic Learning for Audio Applications: A Computer Vision Approach}},
  author    = {Sukthankar, Rahul and Ke, Yan and Hoiem, Derek},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2006},
  pages     = {112},
  doi       = {10.1109/CVPRW.2006.191},
  url       = {https://mlanthology.org/cvprw/2006/sukthankar2006cvprw-semantic/}
}