Quantifying Uncertainty in Online Regression Forests

Abstract

Accurately quantifying uncertainty in predictions is essential for the deployment of machine learning algorithms in critical applications where mistakes are costly. Most approaches to quantifying prediction uncertainty have focused on settings where the data is static or bounded. In this paper, we investigate methods that quantify prediction uncertainty in a streaming setting, where the data is potentially unbounded. We propose two meta-algorithms that produce prediction intervals for online regression forests of arbitrary tree models: one based on conformal prediction, and the other based on quantile regression. We show that both approaches are able to maintain specified error rates, with constant computational cost per example and bounded memory usage. We provide empirical evidence that the methods outperform the state-of-the-art in terms of maintaining error guarantees while being an order of magnitude faster. We also investigate how the algorithms recover from concept drift.
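
The conformal idea the abstract describes can be sketched concisely. The Python below is a minimal sketch, not the authors' implementation: it wraps an arbitrary online regression model's point predictions in prediction intervals, keeping only a bounded sliding window of nonconformity scores so that per-example cost and memory stay constant. The forest object and its predict/learn_one methods are hypothetical placeholders for whatever online learner is used.

from collections import deque
import math


class OnlineConformalIntervals:
    """Wrap an online regressor with conformal-style prediction intervals."""

    def __init__(self, forest, confidence=0.9, window_size=1000):
        self.forest = forest                      # any online regression model
        self.alpha = 1.0 - confidence             # target error rate
        self.scores = deque(maxlen=window_size)   # bounded calibration memory

    def predict_interval(self, x):
        y_hat = self.forest.predict(x)
        if not self.scores:
            return y_hat, y_hat                   # no calibration data yet
        # Empirical (1 - alpha) quantile of recent absolute residuals.
        ranked = sorted(self.scores)
        k = min(len(ranked) - 1,
                math.ceil((1 - self.alpha) * (len(ranked) + 1)) - 1)
        half_width = ranked[k]
        return y_hat - half_width, y_hat + half_width

    def learn_one(self, x, y):
        # Score the example before training on it, then update the model.
        self.scores.append(abs(y - self.forest.predict(x)))
        self.forest.learn_one(x, y)

Because the interval half-width is a quantile over a fixed-size window, a wrapper of this kind can also recover from concept drift: once the window fills with post-drift residuals, the intervals re-adapt to the new error distribution.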

Cite

Text

Vasiloudis et al. "Quantifying Uncertainty in Online Regression Forests." Journal of Machine Learning Research, 2019.

Markdown

[Vasiloudis et al. "Quantifying Uncertainty in Online Regression Forests." Journal of Machine Learning Research, 2019.](https://mlanthology.org/jmlr/2019/vasiloudis2019jmlr-quantifying/)

BibTeX

@article{vasiloudis2019jmlr-quantifying,
  title     = {{Quantifying Uncertainty in Online Regression Forests}},
  author    = {Vasiloudis, Theodore and De Francisci Morales, Gianmarco and Boström, Henrik},
  journal   = {Journal of Machine Learning Research},
  year      = {2019},
  pages     = {1--35},
  volume    = {20},
  url       = {https://mlanthology.org/jmlr/2019/vasiloudis2019jmlr-quantifying/}
}