HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks

Abstract

Implicit neural representations (INRs) are a rapidly growing research area that provides alternative ways to represent multimedia signals. Recent applications of INRs include image super-resolution, compression of high-dimensional signals, and 3D rendering. However, these solutions usually focus on visual data, and adapting them to the audio domain is not trivial. Moreover, they require a separately trained model for every data sample. To address this limitation, we propose HyperSound, a meta-learning method that leverages hypernetworks to produce INRs for audio signals unseen at training time. We show that our approach can reconstruct sound waves with quality comparable to other state-of-the-art models.
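
The core idea can be illustrated with a short sketch: a hypernetwork consumes a raw waveform and emits the weights of a small coordinate MLP (the INR), which then maps time coordinates to amplitudes. The code below is a minimal PyTorch illustration under assumed design choices (a toy convolutional encoder, a SIREN-style sine-activated MLP, illustrative layer sizes and names); it is not the architecture used in the paper.

# Minimal sketch of the hypernetwork -> audio INR idea described in the abstract.
# All class names, layer sizes, and the encoder design are illustrative
# assumptions, not the architecture proposed in the HyperSound paper.
import torch
import torch.nn as nn


class AudioINR(nn.Module):
    """Small MLP mapping a time coordinate t in [-1, 1] to an amplitude.

    Its weights are not learned directly; they are produced per-signal
    by the hypernetwork below and passed in at call time.
    """

    def __init__(self, hidden: int = 64):
        super().__init__()
        # Shapes of the parameters the hypernetwork must generate:
        # (1 -> hidden), (hidden -> hidden), (hidden -> 1), plus biases.
        self.shapes = [(hidden, 1), (hidden,), (hidden, hidden), (hidden,), (1, hidden), (1,)]
        self.num_params = sum(torch.Size(s).numel() for s in self.shapes)

    def forward(self, t: torch.Tensor, flat_weights: torch.Tensor) -> torch.Tensor:
        # Unpack the flat weight vector produced by the hypernetwork.
        params, offset = [], 0
        for shape in self.shapes:
            n = torch.Size(shape).numel()
            params.append(flat_weights[offset:offset + n].view(shape))
            offset += n
        w1, b1, w2, b2, w3, b3 = params
        h = torch.sin(t @ w1.T + b1)   # sine activations, as in SIREN-style INRs
        h = torch.sin(h @ w2.T + b2)
        return h @ w3.T + b3           # predicted amplitude at each coordinate t


class HyperNetwork(nn.Module):
    """Maps a raw waveform to the flat weight vector of an AudioINR."""

    def __init__(self, target: AudioINR):
        super().__init__()
        self.encoder = nn.Sequential(  # toy waveform encoder (assumption)
            nn.Conv1d(1, 16, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.Conv1d(16, 32, kernel_size=9, stride=4, padding=4), nn.ReLU(),
            nn.AdaptiveAvgPool1d(1), nn.Flatten(),
        )
        self.head = nn.Linear(32, target.num_params)

    def forward(self, waveform: torch.Tensor) -> torch.Tensor:
        return self.head(self.encoder(waveform))


if __name__ == "__main__":
    inr = AudioINR()
    hyper = HyperNetwork(inr)
    wave = torch.randn(1, 1, 16384)                  # one (unseen) audio clip
    t = torch.linspace(-1, 1, 16384).unsqueeze(-1)   # time coordinates
    weights = hyper(wave).squeeze(0)                 # per-signal INR weights
    recon = inr(t, weights).squeeze(-1)              # reconstructed waveform
    loss = torch.nn.functional.mse_loss(recon, wave.view(-1))
    loss.backward()                                  # gradients flow into the hypernetwork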

Cite

Text

Szatkowski et al. "HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks." NeurIPS 2022 Workshops: MetaLearn, 2022.

Markdown

[Szatkowski et al. "HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks." NeurIPS 2022 Workshops: MetaLearn, 2022.](https://mlanthology.org/neuripsw/2022/szatkowski2022neuripsw-hypersound/)

BibTeX

@inproceedings{szatkowski2022neuripsw-hypersound,
  title     = {{HyperSound: Generating Implicit Neural Representations of Audio Signals with Hypernetworks}},
  author    = {Szatkowski, Filip and Piczak, Karol J. and Spurek, Przemysław and Tabor, Jacek and Trzcinski, Tomasz},
  booktitle = {NeurIPS 2022 Workshops: MetaLearn},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/szatkowski2022neuripsw-hypersound/}
}