Variational Inference in Non-Negative Factorial Hidden Markov Models for Efficient Audio Source Separation

Abstract

The past decade has seen substantial work on the use of non-negative matrix factorization and its probabilistic counterparts for audio source separation. Although able to capture audio spectral structure well, these models neglect the non-stationarity and temporal dynamics that are important properties of audio. The recently proposed non-negative factorial hidden Markov model (N-FHMM) introduces a temporal dimension and improves source separation performance. However, the factorial nature of this model makes the complexity of inference exponential in the number of sound sources. Here, we present a Bayesian variant of the N-FHMM suited to an efficient variational inference algorithm, whose complexity is linear in the number of sound sources. Our algorithm performs comparably to exact inference in the original N-FHMM but is significantly faster. In typical configurations of the N-FHMM, our method achieves around a 30x increase in speed.
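The scaling contrast described above can be illustrated with a small counting sketch. It is not the paper's cost model. It assumes, purely for illustration, that every source has the same number of hidden states, and it compares the number of joint state combinations a factorial model must consider per time frame with the number of per-source states a fully factorized approximation would track. The function names and the value of 20 states per source are hypothetical.

```python
def joint_state_count(states_per_source: int, num_sources: int) -> int:
    """Joint hidden-state combinations per frame when sources are coupled.

    Assumes every source has the same number of states (illustrative only).
    Grows exponentially in the number of sources.
    """
    return states_per_source ** num_sources


def factorized_state_count(states_per_source: int, num_sources: int) -> int:
    """Per-source states tracked when each source is handled separately.

    Grows linearly in the number of sources.
    """
    return states_per_source * num_sources


if __name__ == "__main__":
    states = 20  # hypothetical number of states per source
    for sources in (1, 2, 3, 4):
        joint = joint_state_count(states, sources)
        factorized = factorized_state_count(states, sources)
        print(f"sources={sources}  joint={joint}  factorized={factorized}")
```

The counts only show the exponential-versus-linear growth in the number of sources; actual running times depend on details of the model and the inference procedure that this sketch does not capture.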

Cite

Text

Mysore and Sahani. "Variational Inference in Non-Negative Factorial Hidden Markov Models for Efficient Audio Source Separation." International Conference on Machine Learning, 2012.

Markdown

[Mysore and Sahani. "Variational Inference in Non-Negative Factorial Hidden Markov Models for Efficient Audio Source Separation." International Conference on Machine Learning, 2012.](https://mlanthology.org/icml/2012/mysore2012icml-variational/)

BibTeX

@inproceedings{mysore2012icml-variational,
  title     = {{Variational Inference in Non-Negative Factorial Hidden Markov Models for Efficient Audio Source Separation}},
  author    = {Mysore, Gautham J. and Sahani, Maneesh},
  booktitle = {International Conference on Machine Learning},
  year      = {2012},
  url       = {https://mlanthology.org/icml/2012/mysore2012icml-variational/}
}