Learning Neural Vocoder from Range-Null Space Decomposition

Abstract

Despite the rapid development of neural vocoders in recent years, they usually suffer from intrinsic challenges such as opaque modeling and the parameter-performance trade-off. In this study, we propose a time-frequency (T-F) domain-based neural vocoder to address these challenges. Specifically, we connect the classical signal range-null decomposition (RND) theory to the vocoder task: the reconstruction of the target spectrogram is decomposed into the superposition of a range-space component and a null-space component, where the former is obtained by a linear domain shift from the original mel-scale domain to the target linear-scale domain, and the latter is instantiated by a learnable network that generates further spectral detail. Accordingly, we propose a novel dual-path framework in which the spectrum is hierarchically encoded and decoded, and cross-band and narrow-band modules are devised for efficient sub-band and sequential modeling. Comprehensive experiments are conducted on the LJSpeech and LibriTTS benchmarks. Quantitative and qualitative results show that, with a lightweight parameter count, the proposed approach yields state-of-the-art performance among existing advanced methods. Our code and the pretrained model weights are available at https://github.com/Andong-Li-speech/RNDVoC.
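
The range-null decomposition described in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: it assumes the mel compression is a linear map `A`, that the "linear domain shift" can be modeled by the Moore-Penrose pseudo-inverse of `A`, and it uses a random matrix and a random vector `z` as hypothetical stand-ins for a real mel filterbank and for the learnable network's output.

```python
import numpy as np

rng = np.random.default_rng(0)
n_mels, n_freq = 8, 32  # illustrative sizes, not taken from the paper

# Assumption: a random non-negative matrix stands in for a mel filterbank.
A = rng.random((n_mels, n_freq))
A_pinv = np.linalg.pinv(A)  # shape (n_freq, n_mels)


def range_null_compose(y, z):
    """Combine a range-space estimate from y with the null-space part of z."""
    range_part = A_pinv @ y           # A^+ y, fixed by the mel observation
    null_part = z - A_pinv @ (A @ z)  # (I - A^+ A) z, invisible to A
    return range_part + null_part


x = rng.random(n_freq)           # toy "linear-scale" magnitude frame
y = A @ x                        # its "mel-scale" observation
z = rng.standard_normal(n_freq)  # hypothetical network output

x_hat = range_null_compose(y, z)

# The null-space term never changes the mel observation of the estimate.
mel_consistent = np.allclose(A @ x_hat, y)
# If z equals the true frame, the decomposition reproduces it exactly.
exact_when_z_is_x = np.allclose(range_null_compose(y, x), x)
print(mel_consistent, exact_when_z_is_x)
```

Under these assumptions, whatever `z` is supplied, the composed estimate stays consistent with the input mel features, so the learnable part only has to supply the detail that the mel projection discards.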

Cite

Text

Li et al. "Learning Neural Vocoder from Range-Null Space Decomposition." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/904

Markdown

[Li et al. "Learning Neural Vocoder from Range-Null Space Decomposition." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/li2025ijcai-learning/) doi:10.24963/IJCAI.2025/904

BibTeX

@inproceedings{li2025ijcai-learning,
  title     = {{Learning Neural Vocoder from Range-Null Space Decomposition}},
  author    = {Li, Andong and Lei, Tong and Sun, Zhihang and Chen, Rilin and Yin, Erwei and Li, Xiaodong and Zheng, Chengshi},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {8131--8140},
  doi       = {10.24963/IJCAI.2025/904},
  url       = {https://mlanthology.org/ijcai/2025/li2025ijcai-learning/}
}