Understanding Three Simultaneous Speeches

Abstract

Understanding three simultaneous speeches is proposed as a challenge problem to foster artificial intelligence, speech and sound understanding or recognition, and computational auditory scene analysis research. Automatic speech recognition under noisy environments is attacked by speech enhancement techniques such as noise reduction and speaker adaptation. However, the signal-to-noise ratio of speech in two simultaneous speeches is too poor to apply these techniques. Therefore, novel techniques need to be developed. One candidate is to use speech stream segregation as a front-end of automatic speech recognition systems. Preliminary experiments on understanding two simultaneous speeches show that the proposed challenge problem will be feasible with speech stream segregation. The detailed plan of the research on and benchmarks for the proposed challenge problem is also presented.

Cite

Text

Okuno et al. "Understanding Three Simultaneous Speeches." International Joint Conference on Artificial Intelligence, 1997.

Markdown

[Okuno et al. "Understanding Three Simultaneous Speeches." International Joint Conference on Artificial Intelligence, 1997.](https://mlanthology.org/ijcai/1997/okuno1997ijcai-understanding/)

BibTeX

@inproceedings{okuno1997ijcai-understanding,
  title     = {{Understanding Three Simultaneous Speeches}},
  author    = {Okuno, Hiroshi G. and Nakatani, Tomohiro and Kawabata, Takeshi},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {1997},
  pages     = {30--35},
  url       = {https://mlanthology.org/ijcai/1997/okuno1997ijcai-understanding/}
}