Data-Driven Execution of Multi-Layered Networks for Automatic Speech Recognition
Abstract
A set of Multi-Layered Networks (MLNs) for Automatic Speech Recognition (ASR) is proposed. Such a set integrates information extracted with variable resolution in the time and frequency domains while keeping the number of links between network nodes small, so that significant generalization can be achieved during learning with a training set of reasonable size. Subsets of networks can be executed depending on preconditions based on descriptions of the time evolution of signal energies, allowing the spectral properties that are significant in different acoustic situations to be learned. Preliminary experiments on speaker-independent recognition of the letters of the E-set are reported. Voices from 70 speakers were used for learning, and voices of 10 new speakers were used for testing. An overall error rate of 9.5% was obtained on the test set, showing that results better than those previously reported can be achieved.
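The idea of executing only those networks whose preconditions hold can be illustrated with a minimal Python sketch. It is not the authors' implementation: the precondition `has_energy_rise`, the threshold value, the stub networks `onset_net` and `vowel_net`, and the choice to combine outputs by summing scores are all assumptions made for illustration, since the abstract does not specify them.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical types: a precondition inspects the energy contour of the signal,
# and a "network" maps a feature vector to per-class scores.
Precondition = Callable[[Sequence[float]], bool]
Network = Callable[[Sequence[float]], Dict[str, float]]


def has_energy_rise(energies: Sequence[float], threshold: float = 0.5) -> bool:
    """Illustrative precondition: some frame-to-frame energy increase exceeds threshold."""
    return any(b - a > threshold for a, b in zip(energies, energies[1:]))


def always(energies: Sequence[float]) -> bool:
    """Illustrative precondition that holds for every input."""
    return True


def onset_net(features: Sequence[float]) -> Dict[str, float]:
    """Stub standing in for a trained network specialized on onsets (fixed scores)."""
    return {"b": 0.75, "d": 0.25}


def vowel_net(features: Sequence[float]) -> Dict[str, float]:
    """Stub standing in for a trained network applied in all situations (fixed scores)."""
    return {"b": 0.5, "d": 0.5}


def run_selected(
    energies: Sequence[float],
    features: Sequence[float],
    gated_networks: List[Tuple[str, Precondition, Network]],
) -> Tuple[List[str], Dict[str, float]]:
    """Execute only the networks whose precondition holds and sum their scores.

    Summing is an assumed combination rule, chosen only to keep the sketch short.
    """
    executed: List[str] = []
    totals: Dict[str, float] = {}
    for name, precondition, network in gated_networks:
        if precondition(energies):
            executed.append(name)
            for label, score in network(features).items():
                totals[label] = totals.get(label, 0.0) + score
    return executed, totals


GATED = [
    ("onset", has_energy_rise, onset_net),
    ("vowel", always, vowel_net),
]

# An energy contour with a sharp rise triggers both networks;
# a flat contour triggers only the unconditional one.
executed_rise, scores_rise = run_selected([0.1, 0.2, 1.0], [0.0], GATED)
executed_flat, scores_flat = run_selected([0.5, 0.5, 0.6], [0.0], GATED)
```

In this sketch the energy contour alone decides which networks run, so each network only needs to be trained on the acoustic situations its precondition selects.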
Cite
Text
de Mori et al. "Data-Driven Execution of Multi-Layered Networks for Automatic Speech Recognition." AAAI Conference on Artificial Intelligence, 1988.
Markdown
[de Mori et al. "Data-Driven Execution of Multi-Layered Networks for Automatic Speech Recognition." AAAI Conference on Artificial Intelligence, 1988.](https://mlanthology.org/aaai/1988/demori1988aaai-data/)
BibTeX
@inproceedings{demori1988aaai-data,
title = {{Data-Driven Execution of Multi-Layered Networks for Automatic Speech Recognition}},
author = {de Mori, Renato and Bengio, Yoshua and Cardin, Régis},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {1988},
pages = {734-738},
url = {https://mlanthology.org/aaai/1988/demori1988aaai-data/}
}