BridgeVoC: Neural Vocoder with Schrödinger Bridge

Abstract

While previous diffusion-based neural vocoders typically follow a noise-to-data generation pipeline, they often neglect the linear-degradation prior of the mel-spectrogram, which limits generation quality. By revisiting the vocoding task and examining its connection with signal restoration, this paper proposes BridgeVoC, a time-frequency (T-F) domain neural vocoder built on the Schrödinger Bridge and the first to follow the data-to-data generation paradigm. Specifically, the mel-spectrogram can be projected into the target linear-scale domain and regarded as a degraded spectral representation with a deficient rank distribution. Based on this, the Schrödinger Bridge is leveraged to establish a connection between the degraded and target data distributions. During inference, the target spectrum is gradually restored starting from the degraded representation, rather than generated from Gaussian noise. Quantitative experiments on LJSpeech and LibriTTS show that BridgeVoC achieves faster inference and surpasses existing diffusion-based vocoder baselines, while also matching or exceeding non-diffusion state-of-the-art methods across evaluation metrics.
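
The rank-deficiency claim can be made concrete with a short derivation. This is only a sketch under assumptions the abstract does not state: the projection is taken to be the Moore-Penrose pseudo-inverse of the mel filterbank, and the symbols $A$, $S$, $\hat{X}$, $F$, $M$, and $T$ are introduced here purely for illustration.

```latex
% Assumed notation (not from the abstract):
%   |S| : target linear-scale magnitude spectrogram, F frequency bins by T frames
%   A   : mel filterbank, M mel bands by F frequency bins, with M < F
%   A^† : Moore-Penrose pseudo-inverse of A (assumed form of the projection)
\begin{align}
X_{\mathrm{mel}} &= A\,|S|,
  \qquad A \in \mathbb{R}^{M \times F},\quad |S| \in \mathbb{R}^{F \times T} \\
\hat{X} &= A^{\dagger} X_{\mathrm{mel}} = A^{\dagger} A\,|S| \\
\operatorname{rank}(\hat{X}) &\le \operatorname{rank}(A^{\dagger} A)
  = \operatorname{rank}(A) \le M < F
\end{align}
```

Under these assumptions, the projected representation $\hat{X}$ has the same shape as the target but a rank of at most $M$, so it loses information whenever the target spectrogram has rank greater than $M$. This is one way to read the abstract's framing of vocoding as restoration from a degraded input rather than generation from noise.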

Cite

Text

Lei et al. "BridgeVoC: Neural Vocoder with Schrödinger Bridge." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/903

Markdown

[Lei et al. "BridgeVoC: Neural Vocoder with Schrödinger Bridge." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/lei2025ijcai-bridgevoc/) doi:10.24963/IJCAI.2025/903

BibTeX

@inproceedings{lei2025ijcai-bridgevoc,
  title     = {{BridgeVoC: Neural Vocoder with Schrödinger Bridge}},
  author    = {Lei, Tong and Zhang, Zhiyu and Chen, Rilin and Yu, Meng and Lu, Jing and Zheng, Chengshi and Yu, Dong and Li, Andong},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {8122-8130},
  doi       = {10.24963/IJCAI.2025/903},
  url       = {https://mlanthology.org/ijcai/2025/lei2025ijcai-bridgevoc/}
}