Evolving Domain Adaptation of Pretrained Language Models for Text Classification

Abstract

Pre-trained language models have shown impressive performance on various text classification tasks. However, their performance is highly dependent on the quality and domain of the labeled training examples. In dynamic real-world environments, text data naturally evolves over time, giving rise to an *evolving domain shift*. This continuous temporal shift impairs the performance of static models, as their training data becomes increasingly outdated. To address this issue, we propose two dynamic buffer-based adaptation strategies: one utilizes self-training with pseudo-labeling, and the other employs a tuning-free, in-context learning approach for large language models (LLMs). We validate our methods with extensive experiments on two longitudinal real-world social media datasets, demonstrating their superiority over unadapted baselines. Furthermore, we introduce a COVID-19 vaccination stance detection dataset, serving as a benchmark for evaluating pre-trained language models in evolving domain adaptation settings.
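To make the buffer-based self-training idea concrete, below is a minimal sketch, not the authors' implementation: it assumes a hypothetical classifier exposing `predict_proba` and `fine_tune` methods and an iterable `time_slices` of unlabeled text batches ordered by time. A sliding buffer of recent texts is pseudo-labeled by the current model, and only high-confidence predictions are used for adaptation at each step.

```python
from collections import deque

def evolving_self_training(model, time_slices, buffer_size=1000, threshold=0.9):
    """Adapt `model` to a temporally ordered stream of unlabeled text batches.

    `model.predict_proba` and `model.fine_tune` are assumed, hypothetical
    interfaces; `predict_proba` is assumed to return an (N, num_classes) array.
    """
    buffer = deque(maxlen=buffer_size)      # dynamic buffer of recent texts
    for texts in time_slices:               # one batch of unlabeled texts per time period
        buffer.extend(texts)                # oldest texts are evicted automatically
        probs = model.predict_proba(list(buffer))
        confidence = probs.max(axis=1)      # model confidence per buffered text
        pseudo_labels = probs.argmax(axis=1)
        keep = confidence >= threshold      # retain only confident pseudo-labels
        confident_texts = [t for t, k in zip(buffer, keep) if k]
        if confident_texts:
            model.fine_tune(confident_texts, pseudo_labels[keep])
    return model
```

The tuning-free variant mentioned in the abstract would instead place buffered, confidently pseudo-labeled examples into an LLM prompt as in-context demonstrations rather than updating model weights.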

Cite

Text

Chuang et al. "Evolving Domain Adaptation of Pretrained Language Models for Text Classification." NeurIPS 2023 Workshops: DistShift, 2023.

Markdown

[Chuang et al. "Evolving Domain Adaptation of Pretrained Language Models for Text Classification." NeurIPS 2023 Workshops: DistShift, 2023.](https://mlanthology.org/neuripsw/2023/chuang2023neuripsw-evolving/)

BibTeX

@inproceedings{chuang2023neuripsw-evolving,
  title     = {{Evolving Domain Adaptation of Pretrained Language Models for Text Classification}},
  author    = {Chuang, Yun-Shiuan and Uppaal, Rheeya and Wu, Yi and Sun, Luhang and Sreedhar, Makesh Narsimhan and Yang, Sijia and Rogers, Timothy T. and Hu, Junjie},
  booktitle = {NeurIPS 2023 Workshops: DistShift},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/chuang2023neuripsw-evolving/}
}