Modeling Prosody Automatically in Concept-to-Speech Generation
Abstract
A Concept-to-Speech (CTS) generator is a system that integrates language generation with speech synthesis and produces speech from semantic representations. This is in contrast to Text-to-Speech (TTS) systems, where speech is produced from text. CTS systems have an advantage over TTS systems because they have access to semantic and pragmatic information, which is considered crucial for prosody generation, the process that models variations in pitch, tempo, and rhythm. My goal is to build a CTS system that produces more natural and intelligible speech than TTS. The CTS system is being developed as part of MAGIC (Dalal et al. 1996), a multimedia presentation generation system for the health-care domain.
Cite
Text
Pan. "Modeling Prosody Automatically in Concept-to-Speech Generation." AAAI Conference on Artificial Intelligence, 1999. doi:10.7916/d8t72rs2
Markdown
[Pan. "Modeling Prosody Automatically in Concept-to-Speech Generation." AAAI Conference on Artificial Intelligence, 1999.](https://mlanthology.org/aaai/1999/pan1999aaai-modeling/) doi:10.7916/d8t72rs2
BibTeX
@inproceedings{pan1999aaai-modeling,
title = {{Modeling Prosody Automatically in Concept-to-Speech Generation}},
author = {Pan, Shimei},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {1999},
pages = {952},
doi = {10.7916/d8t72rs2},
url = {https://mlanthology.org/aaai/1999/pan1999aaai-modeling/}
}