Tigrinya Dialect Identification

Abstract

Dialect identification is an important research topic in Natural Language Processing (NLP), with broad implications for real-world applications such as machine translation, speech recognition, and chatbots. In this work, we investigate Tigrinya dialect identification using machine learning techniques. To that end, we identify three Tigrinya dialects, namely Z, L, and D, and systematically collect a dataset for each. We then run experiments with classical machine learning and deep learning methods to quantify the effectiveness of current methods on Tigrinya dialect identification. The highest overall accuracy, 92.98%, was achieved using character-level Convolutional Neural Networks (CNNs).
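
The abstract does not describe the network itself, so the following is only a minimal sketch of what "character-level" typically means for a CNN text classifier: each character is mapped to an integer id, embedded as a vector, and passed through 1D convolution filters followed by max-over-time pooling. All names, sizes, the random weights, and the placeholder strings are assumptions made for illustration; none of them come from the paper.

```python
import random

def build_vocab(texts):
    """Map each distinct character to an integer id; 0 is reserved for padding/unknown."""
    chars = sorted({ch for t in texts for ch in t})
    return {ch: i + 1 for i, ch in enumerate(chars)}

def encode(text, vocab, max_len):
    """Turn a string into a fixed-length list of character ids (truncate, then pad with 0)."""
    ids = [vocab.get(ch, 0) for ch in text[:max_len]]
    return ids + [0] * (max_len - len(ids))

def conv1d_max_pool(ids, embeddings, filters):
    """Embed the ids, slide each filter over the sequence, apply ReLU, keep the max per filter."""
    x = [embeddings[i] for i in ids]          # shape: (seq_len, emb_dim)
    emb_dim = len(x[0])
    features = []
    for f in filters:                         # each filter has shape (width, emb_dim)
        width = len(f)
        best = 0.0                            # starting at 0 applies the ReLU floor
        for start in range(len(x) - width + 1):
            score = sum(f[k][d] * x[start + k][d]
                        for k in range(width) for d in range(emb_dim))
            best = max(best, score)
        features.append(best)
    return features

# Hypothetical toy setup: placeholder strings and random (untrained) weights.
rng = random.Random(0)
texts = ["ሰላም", "ከመይ ኣለኻ"]                  # placeholder text, not data from the paper
vocab = build_vocab(texts)
emb_dim, width, n_filters, max_len = 5, 3, 4, 16   # assumed sizes

embeddings = [[rng.uniform(-1, 1) for _ in range(emb_dim)]
              for _ in range(len(vocab) + 1)]      # one row per id, including id 0
filters = [[[rng.uniform(-1, 1) for _ in range(emb_dim)]
            for _ in range(width)]
           for _ in range(n_filters)]

ids = encode(texts[1], vocab, max_len)
features = conv1d_max_pool(ids, embeddings, filters)
print(len(ids), len(features))
```

In a trained classifier, the pooled feature vector would feed a final layer that scores each dialect class, and the embeddings and filters would be learned rather than random. Working at the character level avoids the need for a tokenizer or word vocabulary, which is one common reason to choose it for languages with limited tooling.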

Cite

Text

Haileslasie et al. "Tigrinya Dialect Identification." ICLR 2023 Workshops: AfricaNLP, 2023.

Markdown

[Haileslasie et al. "Tigrinya Dialect Identification." ICLR 2023 Workshops: AfricaNLP, 2023.](https://mlanthology.org/iclrw/2023/haileslasie2023iclrw-tigrinya/)

BibTeX

@inproceedings{haileslasie2023iclrw-tigrinya,
  title     = {{Tigrinya Dialect Identification}},
  author    = {Haileslasie, Asfaw Gedamu and Hadgu, Asmelash Teka and Abate, Solomon Teferra},
  booktitle = {ICLR 2023 Workshops: AfricaNLP},
  year      = {2023},
  url       = {https://mlanthology.org/iclrw/2023/haileslasie2023iclrw-tigrinya/}
}