Shallow Diffusion for Fast Speech Enhancement (Student Abstract)

Abstract

Diffusion-based generative models have recently proven successful in speech enhancement. However, these methods typically require many iterations to generate high-quality samples, which makes inference computationally expensive. In this paper, we propose SDFEN (Shallow Diffusion for Fast spEech eNhancement), a novel approach that addresses this inefficiency while improving the quality of generated samples by reducing the number of iterative steps in the reverse diffusion process. Specifically, we introduce a shallow diffusion strategy that initiates the reverse process at an adaptive time step to accelerate inference. In addition, a dedicated noisy predictor is proposed to guide the adaptive selection of the time step. Experimental results demonstrate the superiority of the proposed SDFEN in both effectiveness and efficiency.
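
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea of starting the reverse process at an adaptive step k instead of the final step T. It is not the authors' method. The linear beta schedule, the rule in `select_start_step`, the externally supplied `noise_std` estimate (standing in for the output of a noise-level predictor), and the placeholder `eps_model` are all assumptions made for illustration; the reverse update is the standard DDPM ancestral sampling step.

```python
import math
import random


def make_schedule(num_steps=50, beta_start=1e-4, beta_end=0.05):
    """Linear beta schedule (an assumed choice) and cumulative alpha products."""
    step = (beta_end - beta_start) / (num_steps - 1)
    betas = [beta_start + i * step for i in range(num_steps)]
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return betas, alpha_bars


def select_start_step(noise_std, alpha_bars):
    """Hypothetical rule: pick the first step (1-indexed) whose diffusion
    noise-to-signal ratio is at least the estimated noise level."""
    for i, a_bar in enumerate(alpha_bars):
        if math.sqrt((1.0 - a_bar) / a_bar) >= noise_std:
            return i + 1
    return len(alpha_bars)


def shallow_reverse(noisy, noise_std, eps_model, schedule, rng):
    """Run the reverse process from an adaptive step k down to step 1."""
    betas, alpha_bars = schedule
    k = select_start_step(noise_std, alpha_bars)

    # Assumption: diffuse the noisy input forward to step k before denoising.
    a_k = alpha_bars[k - 1]
    x = [math.sqrt(a_k) * v + math.sqrt(1.0 - a_k) * rng.gauss(0.0, 1.0)
         for v in noisy]

    steps_run = 0
    for t in range(k, 0, -1):
        beta, a_bar = betas[t - 1], alpha_bars[t - 1]
        eps_hat = eps_model(x, t)  # predicted noise at step t
        scale = beta / math.sqrt(1.0 - a_bar)
        mean = [(xi - scale * ei) / math.sqrt(1.0 - beta)
                for xi, ei in zip(x, eps_hat)]
        # No fresh noise is added on the final step.
        sigma = math.sqrt(beta) if t > 1 else 0.0
        x = [m + sigma * rng.gauss(0.0, 1.0) for m in mean]
        steps_run += 1
    return x, k, steps_run


if __name__ == "__main__":
    rng = random.Random(0)
    schedule = make_schedule()
    noisy = [rng.gauss(0.0, 1.0) for _ in range(8)]

    def untrained(x, t):
        # Placeholder for a trained denoising network.
        return [0.0] * len(x)

    _, k, steps = shallow_reverse(noisy, 0.1, untrained, schedule, rng)
    print(f"started at step {k} of {len(schedule[0])}, ran {steps} reverse steps")
```

Under these assumptions, a cleaner input (smaller `noise_std`) maps to a smaller starting step and therefore fewer network evaluations, which is where the inference speedup would come from.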

Cite

Text

Lei et al. "Shallow Diffusion for Fast Speech Enhancement (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I21.30471

Markdown

[Lei et al. "Shallow Diffusion for Fast Speech Enhancement (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/lei2024aaai-shallow/) doi:10.1609/AAAI.V38I21.30471

BibTeX

@inproceedings{lei2024aaai-shallow,
  title     = {{Shallow Diffusion for Fast Speech Enhancement (Student Abstract)}},
  author    = {Lei, Yue and Chen, Bin and Tai, Wenxin and Zhong, Ting and Zhou, Fan},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {23556-23558},
  doi       = {10.1609/AAAI.V38I21.30471},
  url       = {https://mlanthology.org/aaai/2024/lei2024aaai-shallow/}
}