Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization

Abstract

Real-world large-scale datasets are heteroskedastic and imbalanced --- labels have varying levels of uncertainty and label distributions are long-tailed. Heteroskedasticity and imbalance challenge deep learning algorithms due to the difficulty of distinguishing among mislabeled, ambiguous, and rare examples. Addressing heteroskedasticity and imbalance simultaneously is under-explored. We propose a data-dependent regularization technique for heteroskedastic datasets that regularizes different regions of the input space differently. Inspired by the theoretical derivation of the optimal regularization strength in a one-dimensional nonparametric classification setting, our approach adaptively regularizes the data points in higher-uncertainty, lower-density regions more heavily. We test our method on several benchmark tasks, including a real-world heteroskedastic and imbalanced dataset, WebVision. Our experiments corroborate our theory and demonstrate a significant improvement over other methods in noise-robust deep learning.
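To make the high-level idea concrete, the sketch below is a minimal illustration, not the paper's actual algorithm: it shows one way a per-example regularization strength could be made to grow with estimated label uncertainty and shrink with local data density, so that points in higher-uncertainty, lower-density regions are regularized more heavily. The uncertainty and density estimates, the perturbation-consistency penalty, and all coefficients here are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

# Toy setup: a small classifier on 2-D inputs (all names are illustrative).
torch.manual_seed(0)
model = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 3))
x = torch.randn(128, 2)            # inputs
y = torch.randint(0, 3, (128,))    # labels

# Hypothetical per-example statistics, assumed to be estimated elsewhere
# (e.g. a k-NN density estimate and a label-noise / uncertainty model):
density = torch.rand(128) + 0.1    # higher = denser region of input space
uncertainty = torch.rand(128)      # higher = noisier / more ambiguous labels

# Data-dependent regularization strength: regularize points in
# higher-uncertainty, lower-density regions more heavily.
lam = uncertainty / density
lam = lam / lam.mean()             # normalize so the average strength is 1

# One possible regularizer: penalize sensitivity of the logits to small
# input perturbations, weighted per example by lam.
eps = 0.05 * torch.randn_like(x)
logits = model(x)
logits_perturbed = model(x + eps)
consistency = ((logits - logits_perturbed) ** 2).sum(dim=1)

base_reg = 0.1                     # global regularization coefficient
loss = F.cross_entropy(logits, y) + base_reg * (lam * consistency).mean()
loss.backward()
print(float(loss))

The key design choice being illustrated is that the regularization coefficient is a per-example quantity rather than a single global constant; any smoothness-type penalty could be weighted in this way.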

Cite

Text

Cao et al. "Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization." International Conference on Learning Representations, 2021.

Markdown

[Cao et al. "Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization." International Conference on Learning Representations, 2021.](https://mlanthology.org/iclr/2021/cao2021iclr-heteroskedastic/)

BibTeX

@inproceedings{cao2021iclr-heteroskedastic,
  title     = {{Heteroskedastic and Imbalanced Deep Learning with Adaptive Regularization}},
  author    = {Cao, Kaidi and Chen, Yining and Lu, Junwei and Arechiga, Nikos and Gaidon, Adrien and Ma, Tengyu},
  booktitle = {International Conference on Learning Representations},
  year      = {2021},
  url       = {https://mlanthology.org/iclr/2021/cao2021iclr-heteroskedastic/}
}