SeDepTTS: Enhancing the Naturalness via Semantic Dependency and Local Convolution for Text-to-Speech Synthesis
Abstract
Self-attention-based networks have achieved impressive performance in parallel training and global context modeling. However, they are weak at capturing local dependencies, especially for data with strong local correlations such as utterances. Therefore, we mine linguistic information from the original text based on semantic dependency, and treat the semantic relationships between nodes as prior knowledge to revise the self-attention distribution. In addition, given the strong correlation between input characters, we introduce a one-dimensional (1-D) convolutional neural network (CNN) that produces the query (Q) and value (V) in the self-attention mechanism for a better fusion of local contextual information. We then apply this self-attention variant to speech synthesis and propose a non-autoregressive (NAR) neural text-to-speech (TTS) model: SeDepTTS. Experimental results show that our model yields good performance in speech synthesis. Specifically, the proposed method significantly improves the handling of pauses, stress, and intonation in speech.
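The two ideas in the abstract (convolution-produced Q and V, and a dependency prior that revises the attention distribution) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not say how the prior is combined with the attention scores, so the additive bias on the logits, the scaling factor `alpha`, the plain linear projection for K, and all function and parameter names below are assumptions made for illustration only.

```python
import numpy as np

def conv1d_same(x, w):
    """1-D convolution over the time axis with zero 'same' padding.

    x: (T, d_in) input sequence; w: (k, d_in, d_out) kernel with odd k.
    Returns an array of shape (T, d_out).
    """
    k = w.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (0, 0)))
    # Each kernel tap is a linear map applied to a shifted copy of x.
    return sum(xp[i:i + x.shape[0]] @ w[i] for i in range(k))

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def dep_conv_attention(x, w_q, w_k, w_v, dep_prior, alpha=1.0):
    """Single-head attention with conv-produced Q/V and a dependency prior.

    dep_prior: (T, T) matrix, larger where two tokens are linked in the
    semantic dependency graph. Adding it to the logits is an ASSUMPTION;
    the source only says the prior "revises" the attention distribution.
    """
    q = conv1d_same(x, w_q)   # Q from a 1-D convolution (local context)
    k = x @ w_k               # K from a plain linear projection (assumed)
    v = conv1d_same(x, w_v)   # V from a 1-D convolution (local context)
    logits = q @ k.T / np.sqrt(q.shape[-1])
    logits = logits + alpha * dep_prior  # hypothetical additive revision
    attn = softmax(logits, axis=-1)
    return attn @ v, attn
```

With a prior that is zero everywhere, the function reduces to ordinary scaled dot-product attention over the convolved features; a positive entry at position (i, j) raises the weight that token i places on token j.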
Cite
Text
Jiang et al. "SeDepTTS: Enhancing the Naturalness via Semantic Dependency and Local Convolution for Text-to-Speech Synthesis." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I11.26523
Markdown
[Jiang et al. "SeDepTTS: Enhancing the Naturalness via Semantic Dependency and Local Convolution for Text-to-Speech Synthesis." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/jiang2023aaai-sedeptts/) doi:10.1609/AAAI.V37I11.26523
BibTeX
@inproceedings{jiang2023aaai-sedeptts,
title = {{SeDepTTS: Enhancing the Naturalness via Semantic Dependency and Local Convolution for Text-to-Speech Synthesis}},
author = {Jiang, Chenglong and Gao, Ying and Ng, Wing W. Y. and Zhou, Jiyong and Zhong, Jinghui and Zhen, Hongzhong},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2023},
pages = {12959-12967},
doi = {10.1609/AAAI.V37I11.26523},
url = {https://mlanthology.org/aaai/2023/jiang2023aaai-sedeptts/}
}