In Defense of Zero Imputation for Tabular Deep Learning

Abstract

Missing values are a common problem in many supervised learning contexts. While a wealth of literature exists on missing value imputation, less attention has been paid to the impact of imputation on downstream supervised learning. Recently, impute-then-predict neural networks have been proposed as a powerful solution to this problem, allowing for joint optimization of imputations and predictions. In this paper, we illustrate a somewhat surprising result: multi-layer perceptrons (MLPs) paired with zero imputation perform as well as more powerful deep impute-then-predict models on real-world data. To support this finding, we analyze the results of various deep impute-then-predict models to better understand why they fail to outperform zero imputation. Our analysis sheds light on the difficulties of imputation in real-world contexts and highlights the utility of zero imputation for tabular deep learning.
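
For readers unfamiliar with the term, zero imputation simply replaces each missing entry with 0 before the features are passed to the model. The snippet below is a minimal illustrative sketch, not the authors' code: the helper name `zero_impute` and the convention that missing entries are encoded as `None` or NaN are assumptions made for this example.

```python
import math


def zero_impute(rows):
    """Return a copy of `rows` with every missing entry set to 0.0.

    Assumption for this sketch: a missing entry is encoded as None or NaN.
    """
    def is_missing(value):
        return value is None or (isinstance(value, float) and math.isnan(value))

    return [[0.0 if is_missing(v) else float(v) for v in row] for row in rows]


# Toy feature matrix with missing entries (hypothetical data).
X = [
    [1.5, float("nan"), 3.0],
    [None, 2.0, float("nan")],
]

X_imputed = zero_impute(X)
print(X_imputed)  # [[1.5, 0.0, 3.0], [0.0, 2.0, 0.0]]
```

The imputed matrix can then be fed to any standard MLP; no imputation parameters are learned in this step.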

Cite

Text

Van Ness and Udell. "In Defense of Zero Imputation for Tabular Deep Learning." NeurIPS 2023 Workshops: TRL, 2023.

Markdown

[Van Ness and Udell. "In Defense of Zero Imputation for Tabular Deep Learning." NeurIPS 2023 Workshops: TRL, 2023.](https://mlanthology.org/neuripsw/2023/ness2023neuripsw-defense/)

BibTeX

@inproceedings{ness2023neuripsw-defense,
  title     = {{In Defense of Zero Imputation for Tabular Deep Learning}},
  author    = {Van Ness, Mike and Udell, Madeleine},
  booktitle = {NeurIPS 2023 Workshops: TRL},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/ness2023neuripsw-defense/}
}