Order Matters in the Presence of Dataset Imbalance for Multilingual Learning

Abstract

In this paper, we empirically study the optimization dynamics of multi-task learning, particularly focusing on those that govern a collection of tasks with significant data imbalance. We present a simple yet effective method of pre-training on high-resource tasks, followed by fine-tuning on a mixture of high- and low-resource tasks. We provide a thorough empirical study and analysis of this method's benefits, showing that it achieves consistent improvements relative to the performance trade-off profile of standard static weighting. We analyze the data regimes in which this method is applicable and demonstrate its improvements empirically in neural machine translation (NMT) and multilingual language modeling.
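The two-stage schedule described above can be sketched as a step-dependent task sampler: draw only from high-resource tasks during pre-training, then from the full high/low-resource mixture during fine-tuning. The sketch below is illustrative only; the task names are hypothetical language pairs, and the uniform mixing weights are an assumption, not the paper's exact weighting scheme.

```python
import random

def make_sampler(high_tasks, low_tasks, pretrain_steps, seed=0):
    """Return a function mapping a training step to a task to sample from.

    Steps before `pretrain_steps` draw only from high-resource tasks
    (pre-training phase); later steps draw uniformly from the combined
    high/low-resource pool (fine-tuning phase). Uniform mixing is an
    illustrative choice, not the method's prescribed weighting.
    """
    rng = random.Random(seed)

    def sample(step):
        pool = high_tasks if step < pretrain_steps else high_tasks + low_tasks
        return rng.choice(pool)

    return sample

# Hypothetical language pairs standing in for high/low-resource tasks.
sampler = make_sampler(["en-fr", "en-de"], ["en-sw", "en-gd"], pretrain_steps=1000)
```

A static-weighting baseline, by contrast, would use the same (step-independent) mixture throughout training; the point of the schedule is that the order of exposure matters under data imbalance.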

Cite

Text

Choi et al. "Order Matters in the Presence of Dataset Imbalance for Multilingual Learning." Neural Information Processing Systems, 2023.

Markdown

[Choi et al. "Order Matters in the Presence of Dataset Imbalance for Multilingual Learning." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/choi2023neurips-order/)

BibTeX

@inproceedings{choi2023neurips-order,
  title     = {{Order Matters in the Presence of Dataset Imbalance for Multilingual Learning}},
  author    = {Choi, Dami and Xin, Derrick and Dadkhahi, Hamid and Gilmer, Justin and Garg, Ankush and Firat, Orhan and Yeh, Chih-Kuan and Dai, Andrew M. and Ghorbani, Behrooz},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/choi2023neurips-order/}
}