Time-Domain, Digital Segmentation of Connected Natural Speech

Abstract

The digital segmentation algorithm described in this paper subdivides speech signals into discrete sections that make it possible to localize most of the spoken phonemes in natural speech. Two pre-segmentation steps separate pauses and voiceless parts from the remaining (voiced) signal. The subsequent main segmentation step attempts to describe the speed of articulation in the vocal tract from a set of global speech parameters. Because the vocal tract does not move at constant speed during an utterance, but instead tries to reach the articulatory target position associated with each phoneme, sections with relatively small changes in vocal tract position (stationary segments) can be separated from sections with larger changes (dynamic segments). The dynamic segments can be characterized further by considering the direction in which the parameters change.
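
To make the pipeline in the abstract concrete, here is a minimal Python sketch of a two-stage, time-domain segmentation. It is not the paper's algorithm. The abstract does not name the global speech parameters or any thresholds, so the sketch assumes short-time energy and zero-crossing rate as stand-ins. The function names (`frame_features`, `label_frames`, `merge_runs`) and all threshold values are hypothetical.

```python
import math


def frame_features(samples, frame_len):
    """Return (energy, zero_crossing_rate) for each non-overlapping frame."""
    feats = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energy = sum(x * x for x in frame) / frame_len
        crossings = sum(
            1 for a, b in zip(frame, frame[1:]) if (a < 0) != (b < 0)
        )
        feats.append((energy, crossings / (frame_len - 1)))
    return feats


def label_frames(feats, pause_energy=1e-4, voiceless_zcr=0.3, change_thresh=0.5):
    """Label each frame; all thresholds are illustrative assumptions."""
    labels = []
    prev_log_e = None  # log-energy of the previous voiced frame, if any
    for energy, zcr in feats:
        # Pre-segmentation step 1 (assumed): low energy means a pause.
        if energy < pause_energy:
            labels.append("pause")
            prev_log_e = None
            continue
        # Pre-segmentation step 2 (assumed): high zero-crossing rate
        # means a voiceless part.
        if zcr > voiceless_zcr:
            labels.append("voiceless")
            prev_log_e = None
            continue
        # Main step (assumed): a small frame-to-frame change in log-energy
        # is "stationary"; a large one is "dynamic", tagged with direction.
        log_e = math.log(energy)
        if prev_log_e is None or abs(log_e - prev_log_e) < change_thresh:
            labels.append("stationary")
        elif log_e > prev_log_e:
            labels.append("dynamic_rising")
        else:
            labels.append("dynamic_falling")
        prev_log_e = log_e
    return labels


def merge_runs(labels):
    """Collapse consecutive equal labels into (label, first_frame, end_frame)."""
    segments = []
    for i, lab in enumerate(labels):
        if segments and segments[-1][0] == lab:
            segments[-1] = (lab, segments[-1][1], i + 1)
        else:
            segments.append((lab, i, i + 1))
    return segments
```

Calling `merge_runs(label_frames(frame_features(samples, frame_len)))` on a list of float samples yields labeled segments with half-open frame ranges. A real system would tune the thresholds to the recording conditions and would likely track more than one parameter.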

Cite

Text

Hoss. "Time-Domain, Digital Segmentation of Connected Natural Speech." International Joint Conference on Artificial Intelligence, 1975.

Markdown

[Hoss. "Time-Domain, Digital Segmentation of Connected Natural Speech." International Joint Conference on Artificial Intelligence, 1975.](https://mlanthology.org/ijcai/1975/hoss1975ijcai-time/)

BibTeX

@inproceedings{hoss1975ijcai-time,
  title     = {{Time-Domain, Digital Segmentation of Connected Natural Speech}},
  author    = {Hoss, W.},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {1975},
  pages     = {491--498},
  url       = {https://mlanthology.org/ijcai/1975/hoss1975ijcai-time/}
}