Mind Your Language: Abuse and Offense Detection for Code-Switched Languages
Abstract
In multilingual societies such as the Indian subcontinent, code-switched languages are popular and convenient for users. In this paper, we study offense and abuse detection in Hinglish, the code-switched pair of Hindi and English and the most widely spoken such pair. The task is made difficult by the non-fixed grammar, vocabulary, semantics, and spellings of Hinglish. We apply transfer learning and build an LSTM-based model for hate speech classification. This model surpasses the performance of the current best models, establishing itself as the state of the art in the unexplored domain of Hinglish offensive text classification. We also release our model and the trained embeddings for research purposes.
Cite
Text
Kapoor et al. "Mind Your Language: Abuse and Offense Detection for Code-Switched Languages." AAAI Conference on Artificial Intelligence, 2019. doi:10.1609/AAAI.V33I01.33019951
Markdown
[Kapoor et al. "Mind Your Language: Abuse and Offense Detection for Code-Switched Languages." AAAI Conference on Artificial Intelligence, 2019.](https://mlanthology.org/aaai/2019/kapoor2019aaai-mind/) doi:10.1609/AAAI.V33I01.33019951
BibTeX
@inproceedings{kapoor2019aaai-mind,
title = {{Mind Your Language: Abuse and Offense Detection for Code-Switched Languages}},
author = {Kapoor, Raghav and Kumar, Yaman and Rajput, Kshitij and Shah, Rajiv Ratn and Kumaraguru, Ponnurangam and Zimmermann, Roger},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2019},
pages = {9951-9952},
doi = {10.1609/AAAI.V33I01.33019951},
url = {https://mlanthology.org/aaai/2019/kapoor2019aaai-mind/}
}