Adaptive Infinite Dropout for Noisy and Sparse Data Streams

Abstract

The ability to analyze data streams, which arrive sequentially and possibly infinitely, is increasingly vital in various online applications. However, data streams pose several challenges, including sparse and noisy data as well as concept drift, all of which can easily mislead a learning method. This paper proposes a simple yet robust framework, called Adaptive Infinite Dropout (aiDropout), to tackle these problems effectively. Our framework uses a dropout technique within a recursive Bayesian approach to create a flexible mechanism for balancing old and new information. In detail, the recursive Bayesian approach imposes a constraint on the model parameters, which yields a regularization term between the current and previous mini-batches. Dropout, whose drop rate is learned autonomously, then adjusts this constraint to new data. Thanks to its ability to reduce overfitting and the ensemble property of dropout, our framework generalizes better and thus effectively handles the undesirable effects of noise and sparsity. In particular, theoretical analyses show that aiDropout imposes a data-dependent regularization and can therefore adapt quickly to sudden changes in data streams. Extensive experiments show that aiDropout significantly outperforms state-of-the-art baselines on a variety of tasks, including supervised and unsupervised learning.
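
The mechanism described above, in which the previous mini-batch's estimate acts as a prior for the current one and dropout loosens that prior, can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a linear-Gaussian model as a stand-in, uses a fixed keep probability instead of a learned drop rate, and the names `map_update`, `run_stream`, `keep_prob`, and `lam` are hypothetical.

```python
import numpy as np

def map_update(X, y, w_prev, keep_prob, lam, rng):
    """One recursive MAP step with a dropout-masked prior mean (illustrative only)."""
    # Bernoulli mask: each previous weight is kept with probability keep_prob.
    mask = rng.random(w_prev.shape) < keep_prob
    prior_mean = np.where(mask, w_prev, 0.0)
    # Closed-form minimizer of ||X w - y||^2 + lam * ||w - prior_mean||^2.
    d = X.shape[1]
    A = X.T @ X + lam * np.eye(d)
    b = X.T @ y + lam * prior_mean
    return np.linalg.solve(A, b)

def run_stream(batches, dim, keep_prob=0.8, lam=1.0, seed=0):
    """Process mini-batches in order, carrying the estimate forward as the prior."""
    rng = np.random.default_rng(seed)
    w = np.zeros(dim)
    for X, y in batches:
        w = map_update(X, y, w, keep_prob, lam, rng)
    return w
```

In this sketch, `lam` sets how strongly the current estimate is tied to the previous one, and the mask randomly removes part of that tie on each mini-batch. Learning the drop rate from data, as the abstract describes, is left out here.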

Cite

Text

Nguyen et al. "Adaptive Infinite Dropout for Noisy and Sparse Data Streams." Machine Learning, 2022. doi:10.1007/S10994-022-06169-W

Markdown

[Nguyen et al. "Adaptive Infinite Dropout for Noisy and Sparse Data Streams." Machine Learning, 2022.](https://mlanthology.org/mlj/2022/nguyen2022mlj-adaptive/) doi:10.1007/S10994-022-06169-W

BibTeX

@article{nguyen2022mlj-adaptive,
  title     = {{Adaptive Infinite Dropout for Noisy and Sparse Data Streams}},
  author    = {Nguyen, Ha and Pham, Hoang and Nguyen, Son and Van Linh, Ngo and Than, Khoat},
  journal   = {Machine Learning},
  year      = {2022},
  pages     = {3025-3060},
  doi       = {10.1007/S10994-022-06169-W},
  volume    = {111},
  url       = {https://mlanthology.org/mlj/2022/nguyen2022mlj-adaptive/}
}