A Survey on Out-of-Distribution Detection in NLP
Abstract
Out-of-distribution (OOD) detection is essential for the reliable and safe deployment of machine learning systems in the real world. Great progress has been made in recent years. This paper presents the first review of recent advances in OOD detection with a particular focus on natural language processing approaches. First, we provide a formal definition of OOD detection and discuss several related fields. Second, we categorize recent algorithms into three classes according to the data they use: (1) OOD data available, (2) OOD data unavailable + in-distribution (ID) label available, and (3) OOD data unavailable + ID label unavailable. Third, we introduce datasets, applications, and metrics. Finally, we summarize existing work and present potential future research topics.
Cite
Text
Lang et al. "A Survey on Out-of-Distribution Detection in NLP." Transactions on Machine Learning Research, 2024.
Markdown
[Lang et al. "A Survey on Out-of-Distribution Detection in NLP." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/lang2024tmlr-survey/)
BibTeX
@article{lang2024tmlr-survey,
title = {{A Survey on Out-of-Distribution Detection in NLP}},
author = {Lang, Hao and Zheng, Yinhe and Li, Yixuan and Sun, Jian and Huang, Fei and Li, Yongbin},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/lang2024tmlr-survey/}
}