Domain Adaptive Pretraining for Multilingual Acronym Extraction

Abstract

This paper presents our findings from participating in the multilingual acronym extraction shared task SDU@AAAI-22. The task consists of acronym extraction from documents in six languages within the scientific and legal domains. To address multilingual acronym extraction, we employed a BiLSTM-CRF model with multilingual XLM-RoBERTa embeddings. We further pretrained XLM-RoBERTa on the shared task corpus to adapt its embeddings to the task domains. Our system (team: SMR-NLP) achieved competitive performance for acronym extraction across all six languages.
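As a rough illustration of the domain-adaptive pretraining step described above, the sketch below continues masked-language-model training of XLM-RoBERTa on the shared task corpus before it is used to supply embeddings for the BiLSTM-CRF tagger. The file path, corpus layout (one document per line, all six languages pooled), and hyperparameters are assumptions for illustration, not details taken from the paper.

# Hedged sketch: domain-adaptive MLM pretraining of XLM-RoBERTa with Hugging Face
# transformers. Paths and hyperparameters are assumed, not from the paper.
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "xlm-roberta-base"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForMaskedLM.from_pretrained(model_name)

# Assumed layout: one sentence/document per line, pooled across the six task languages.
corpus = load_dataset("text", data_files={"train": "shared_task_corpus.txt"})

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=256)

tokenized = corpus.map(tokenize, batched=True, remove_columns=["text"])

# Standard masked-language-model objective: randomly mask 15% of tokens.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

args = TrainingArguments(
    output_dir="xlmr-acronym-dapt",
    per_device_train_batch_size=16,
    num_train_epochs=3,
    learning_rate=5e-5,
    save_total_limit=1,
)

Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    data_collator=collator,
).train()

# The adapted encoder can then provide token embeddings to a downstream BiLSTM-CRF sequence tagger.
model.save_pretrained("xlmr-acronym-dapt")
tokenizer.save_pretrained("xlmr-acronym-dapt")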

Cite

Text

Yaseen and Langer. "Domain Adaptive Pretraining for Multilingual Acronym Extraction." AAAI Conference on Artificial Intelligence, 2022. doi:10.48550/arxiv.2206.15221

Markdown

[Yaseen and Langer. "Domain Adaptive Pretraining for Multilingual Acronym Extraction." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/yaseen2022aaai-domain/) doi:10.48550/arxiv.2206.15221

BibTeX

@inproceedings{yaseen2022aaai-domain,
  title     = {{Domain Adaptive Pretraining for Multilingual Acronym Extraction}},
  author    = {Yaseen, Usama and Langer, Stefan},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  doi       = {10.48550/arxiv.2206.15221},
  url       = {https://mlanthology.org/aaai/2022/yaseen2022aaai-domain/}
}