On the Need for a Language Describing Distribution Shifts: Illustrations on Tabular Datasets

Abstract

Different distribution shifts require different algorithmic and operational interventions. Methodological research must be grounded in the specific shifts it addresses. Although nascent benchmarks provide a promising empirical foundation, they implicitly focus on covariate shifts, and the validity of empirical findings depends on the type of shift; for example, previous observations on algorithmic performance can fail to hold when the $Y|X$ distribution changes. We conduct a thorough investigation of natural shifts in 5 tabular datasets over 86,000 model configurations and find that $Y|X$-shifts are most prevalent. To encourage researchers to develop a refined language for distribution shifts, we build WhyShift, an empirical testbed of curated real-world shifts in which we characterize the type of shift we benchmark performance over. Since $Y|X$-shifts are prevalent in tabular settings, we identify covariate regions that suffer the biggest $Y|X$-shifts and discuss implications for algorithmic and data-based interventions. Our testbed highlights the importance of future research that builds an understanding of why distributions differ.
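To make the distinction between covariate ($X$) shifts and $Y|X$-shifts concrete, the sketch below splits the change in a fixed classifier's 0-1 risk between a source and a target distribution into a part caused by the change in $P(X)$ and a part caused by the change in $P(Y|X)$. It is a toy illustration on a two-valued feature, not the procedure used in the paper or in WhyShift: the function names `risk` and `decompose_shift`, the feature values, and all probabilities are hypothetical. The split also depends on the order in which the two components are swapped; here the marginal is changed first.

```python
def risk(p_x, p_y1_given_x, predict):
    """Expected 0-1 loss of `predict` when X ~ p_x and P(Y=1 | X=x) = p_y1_given_x[x]."""
    total = 0.0
    for x, px in p_x.items():
        p1 = p_y1_given_x[x]
        # Probability that the true label disagrees with the prediction at this x.
        err = (1.0 - p1) if predict(x) == 1 else p1
        total += px * err
    return total


def decompose_shift(src_x, src_y, tgt_x, tgt_y, predict):
    """Telescoping split of the risk change (hypothetical helper, not from the paper).

    x_shift:         swap P(X) from source to target, keep the source P(Y|X).
    y_given_x_shift: then swap P(Y|X) from source to target.
    """
    base = risk(src_x, src_y, predict)   # source marginal, source conditional
    mixed = risk(tgt_x, src_y, predict)  # target marginal, source conditional
    full = risk(tgt_x, tgt_y, predict)   # target marginal, target conditional
    return {
        "x_shift": mixed - base,
        "y_given_x_shift": full - mixed,
        "total": full - base,
    }


# Hypothetical classifier: predict 1 only in region "b".
def predict(x):
    return 1 if x == "b" else 0


src_x = {"a": 0.5, "b": 0.5}  # source P(X)
src_y = {"a": 0.2, "b": 0.9}  # source P(Y=1 | X)

# Pure covariate shift: only P(X) changes.
covariate_only = decompose_shift(src_x, src_y, {"a": 0.2, "b": 0.8}, src_y, predict)

# Pure Y|X-shift: only P(Y=1 | X) changes, and only in region "b".
conditional_only = decompose_shift(src_x, src_y, src_x, {"a": 0.2, "b": 0.6}, predict)

print(covariate_only)
print(conditional_only)
```

In the second scenario the entire change in risk is attributed to the $Y|X$ term and comes from region "b" alone, which is a miniature version of asking which covariate regions suffer the largest $Y|X$-shift.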

Cite

Text

Liu et al. "On the Need for a Language Describing Distribution Shifts: Illustrations on Tabular Datasets." Neural Information Processing Systems, 2023.

Markdown

[Liu et al. "On the Need for a Language Describing Distribution Shifts: Illustrations on Tabular Datasets." Neural Information Processing Systems, 2023.](https://mlanthology.org/neurips/2023/liu2023neurips-need/)

BibTeX

@inproceedings{liu2023neurips-need,
  title     = {{On the Need for a Language Describing Distribution Shifts: Illustrations on Tabular Datasets}},
  author    = {Liu, Jiashuo and Wang, Tianyu and Cui, Peng and Namkoong, Hongseok},
  booktitle = {Neural Information Processing Systems},
  year      = {2023},
  url       = {https://mlanthology.org/neurips/2023/liu2023neurips-need/}
}