Connectionist Speaker Normalization with Generalized Resource Allocating Networks
Abstract
The paper presents a rapid speaker-normalization technique based on neural network spectral mapping. The neural network is used as a front-end of a continuous speech recognition system (speaker-dependent, HMM-based) to normalize the input acoustic data from a new speaker. The spectral difference between speakers can be reduced using a limited amount of new acoustic data (40 phonetically rich sentences). Recognition error of phone units from the acoustic-phonetic continuous speech corpus APASCI is decreased with an adaptability ratio of 25%. We used local basis networks of elliptical Gaussian kernels, with recursive allocation of units and on-line optimization of parameters (GRAN model). For this application, the model included a linear term. The results compare favorably with multivariate linear mapping based on constrained orthonormal transformations.
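As a rough illustration of the kind of mapping the abstract describes, the sketch below evaluates a local basis network of elliptical Gaussian kernels plus a linear term. It is not the authors' implementation: the function names (`elliptical_gaussian`, `gran_forward`), the axis-aligned form of the kernels (one width per input dimension), and all parameter values are assumptions made for the example. Unit allocation and on-line parameter optimization are not shown.

```python
import math


def elliptical_gaussian(x, center, widths):
    """Axis-aligned elliptical Gaussian kernel (assumed form).

    Each input dimension has its own width, so the level sets are
    ellipses rather than circles.
    """
    z = sum(((xi - ci) / si) ** 2 for xi, ci, si in zip(x, center, widths))
    return math.exp(-0.5 * z)


def gran_forward(x, centers, widths, weights, A, b):
    """Map an input vector x to an output vector (hypothetical helper).

    y = A x + b + sum_j weights[j] * phi_j(x)
    where phi_j is the j-th elliptical Gaussian kernel.
    """
    # Linear term: one row of A and one bias per output dimension.
    y = [sum(a * xi for a, xi in zip(row, x)) + bk for row, bk in zip(A, b)]
    # Local basis term: each kernel adds its weighted activation.
    for c, s, w in zip(centers, widths, weights):
        phi = elliptical_gaussian(x, c, s)
        y = [yk + wk * phi for yk, wk in zip(y, w)]
    return y


# Toy parameters (illustrative only): 2-D input, 2-D output, one kernel,
# identity linear term.
params = dict(
    centers=[[1.0, 2.0]],
    widths=[[0.5, 2.0]],
    weights=[[0.1, -0.2]],
    A=[[1.0, 0.0], [0.0, 1.0]],
    b=[0.0, 0.0],
)

y_near = gran_forward([1.0, 2.0], **params)     # input at the kernel center
y_far = gran_forward([100.0, 100.0], **params)  # input far from the kernel
```

With an identity linear term, an input far from every kernel center passes through essentially unchanged, while an input near a center receives a local correction. This shows how a linear term and local kernels can share the work of a spectral mapping.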
Cite
Text
Furlanello et al. "Connectionist Speaker Normalization with Generalized Resource Allocating Networks." Neural Information Processing Systems, 1994.
Markdown
[Furlanello et al. "Connectionist Speaker Normalization with Generalized Resource Allocating Networks." Neural Information Processing Systems, 1994.](https://mlanthology.org/neurips/1994/furlanello1994neurips-connectionist/)
BibTeX
@inproceedings{furlanello1994neurips-connectionist,
title = {{Connectionist Speaker Normalization with Generalized Resource Allocating Networks}},
author = {Furlanello, Cesare and Giuliani, Diego and Trentin, Edmondo},
booktitle = {Neural Information Processing Systems},
year = {1994},
pages = {865-874},
url = {https://mlanthology.org/neurips/1994/furlanello1994neurips-connectionist/}
}