Domain Adaptive Pretraining for Multilingual Acronym Extraction
Abstract
This paper presents our findings from participating in the multilingual acronym extraction shared task SDU@AAAI-22. The task consists of acronym extraction from documents in 6 languages within the scientific and legal domains. To address multilingual acronym extraction, we employed a BiLSTM-CRF with multilingual XLM-RoBERTa embeddings. We pretrained the XLM-RoBERTa model on the shared task corpus to further adapt its embeddings to the shared task domains. Our system (team: SMR-NLP) achieved competitive performance for acronym extraction across all the languages.
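The domain-adaptive pretraining described above amounts to continued masked-language-model training of XLM-RoBERTa on the shared task documents before fine-tuning the extraction model. The following is a minimal sketch of that step using Hugging Face transformers; the corpus file name and hyperparameters are illustrative assumptions, not the authors' settings.

```python
# Sketch: continued MLM pretraining of XLM-RoBERTa on a task corpus
# (domain-adaptive pretraining). Paths and hyperparameters are hypothetical.
from datasets import load_dataset
from transformers import (
    XLMRobertaTokenizerFast,
    XLMRobertaForMaskedLM,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = XLMRobertaTokenizerFast.from_pretrained("xlm-roberta-base")
model = XLMRobertaForMaskedLM.from_pretrained("xlm-roberta-base")

# Hypothetical plain-text file assembled from the shared-task documents.
dataset = load_dataset("text", data_files={"train": "task_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = dataset["train"].map(tokenize, batched=True, remove_columns=["text"])

# Standard 15% token masking for the MLM objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="xlmr-dapt",
    per_device_train_batch_size=16,
    num_train_epochs=3,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=collator,
).train()
```

The adapted checkpoint saved under `xlmr-dapt` would then supply the embeddings fed to the BiLSTM-CRF sequence tagger for acronym extraction.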
Cite
Text
Yaseen and Langer. "Domain Adaptive Pretraining for Multilingual Acronym Extraction." AAAI Conference on Artificial Intelligence, 2022. doi:10.48550/arxiv.2206.15221
Markdown
[Yaseen and Langer. "Domain Adaptive Pretraining for Multilingual Acronym Extraction." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/yaseen2022aaai-domain/) doi:10.48550/arxiv.2206.15221
BibTeX
@inproceedings{yaseen2022aaai-domain,
  title     = {{Domain Adaptive Pretraining for Multilingual Acronym Extraction}},
  author    = {Yaseen, Usama and Langer, Stefan},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  doi       = {10.48550/arxiv.2206.15221},
  url       = {https://mlanthology.org/aaai/2022/yaseen2022aaai-domain/}
}