Residue-Driven Architecture for Computational Auditory Scene Analysis

Abstract

The Residue-Driven Architecture presented here is a model of auditory stream segregation from input sounds. A subsystem that extracts auditory streams by using particular sound attributes is called an agency, and each agency is designed according to the residue-driven architecture. This architecture consists of three kinds of agents: an event-detector, a tracer-generator, and tracers. The event-detector calculates a residue by subtracting the predicted input from the actual input. When the residue exceeds a threshold value, the tracer-generator generates a tracer that extracts an auditory stream from the residue and returns a predicted input for the next time frame to the event-detector. This approach improves segregation performance, and the resulting system can segregate a woman's voiced stream, a man's voiced stream, and a noise stream from a mixture of these sounds. Binaural segregation is also designed with this architecture.
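The prediction/residue loop described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the per-frame scalar signal, the naive last-value prediction, and the names `Tracer` and `process_frame` are all illustrative assumptions; the real system operates on spectral sound attributes.

```python
class Tracer:
    """Hypothetical tracer: tracks one auditory stream and predicts
    its contribution to the next time frame (naive last-value model)."""

    def __init__(self, initial):
        self.stream = [initial]       # values extracted so far
        self.prediction = initial     # predicted input for the next frame

    def extract(self, value):
        # Extract this frame's portion of the input and update the prediction.
        self.stream.append(value)
        self.prediction = value


def process_frame(actual, tracers, threshold=0.05):
    """One event-detector step (sketch, assumed scalar frames):
    residue = actual input - sum of the tracers' predicted inputs.
    A residue above the threshold triggers the tracer-generator."""
    residue = actual - sum(t.prediction for t in tracers)
    if abs(residue) > threshold:
        # Tracer-generator: spawn a tracer for the unexplained residue.
        tracers.append(Tracer(residue))
    else:
        # Input is explained; each tracer extracts its predicted portion.
        for t in tracers:
            t.extract(t.prediction)
    return residue


tracers = []
# A constant component appears, then a second component joins at frame 3:
# each onset leaves a residue that spawns a new tracer.
residues = [process_frame(x, tracers) for x in [1.0, 1.0, 1.5, 1.5]]
```

With this toy signal, the first frame and the onset at frame 3 each exceed the threshold and spawn a tracer, after which the summed predictions cancel the input and the residue returns to zero.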

Cite

Text

Nakatani et al. "Residue-Driven Architecture for Computational Auditory Scene Analysis." International Joint Conference on Artificial Intelligence, 1995.

Markdown

[Nakatani et al. "Residue-Driven Architecture for Computational Auditory Scene Analysis." International Joint Conference on Artificial Intelligence, 1995.](https://mlanthology.org/ijcai/1995/nakatani1995ijcai-residue/)

BibTeX

@inproceedings{nakatani1995ijcai-residue,
  title     = {{Residue-Driven Architecture for Computational Auditory Scene Analysis}},
  author    = {Nakatani, Tomohiro and Okuno, Hiroshi G. and Kawabata, Takeshi},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {1995},
  pages     = {165--174},
  url       = {https://mlanthology.org/ijcai/1995/nakatani1995ijcai-residue/}
}