AfriNames: Most ASR Models "butcher" African Names

Abstract

Useful conversational agents must accurately capture named entities to minimize errors in downstream tasks, such as asking a voice assistant to play a track by a certain artist, initiating navigation to a specific location, or documenting a diagnosis for a specific patient. However, when named entities such as "Ukachukwu" (Igbo), "Lakicia" (Swahili), or "Ingabire" (Rwandan) are spoken, the performance of automatic speech recognition (ASR) models degrades significantly, propagating errors to downstream systems. We model this problem as a distribution shift and demonstrate that such model bias can be mitigated through multilingual pre-training, intelligent data augmentation strategies that increase the representation of African named entities, and fine-tuning multilingual ASR models on multiple African accents. The resulting fine-tuned models show an 86.4% relative improvement over the baseline on samples containing African named entities.

Cite

Text

Olatunji et al. "AfriNames: Most ASR Models 'butcher' African Names." ICLR 2023 Workshops: AfricaNLP, 2023.

Markdown

[Olatunji et al. "AfriNames: Most ASR Models 'butcher' African Names." ICLR 2023 Workshops: AfricaNLP, 2023.](https://mlanthology.org/iclrw/2023/olatunji2023iclrw-afrinames/)

BibTeX

@inproceedings{olatunji2023iclrw-afrinames,
  title     = {{AfriNames: Most ASR Models "butcher" African Names}},
  author    = {Olatunji, Tobi and Afonja, Tejumade and Dossou, Bonaventure F. P. and Tonja, Atnafu Lambebo and Emezue, Chris Chinenye and Rufai, Amina Mardiyyah and Singh, Sahib},
  booktitle = {ICLR 2023 Workshops: AfricaNLP},
  year      = {2023},
  url       = {https://mlanthology.org/iclrw/2023/olatunji2023iclrw-afrinames/}
}