Online Allocation and Homogeneous Partitioning for Piecewise Constant Mean-Approximation

Abstract

In the setting of active learning for the multi-armed bandit, where the goal of a learner is to estimate with equal precision the mean of a finite number of arms, recent results show that it is possible to derive strategies based on finite-time confidence bounds that are competitive with the best possible strategy. We here consider an extension of this problem to the case when the arms are the cells of a finite partition P of a continuous sampling space X \subset \Real^d. Our goal is now to build a piecewise constant approximation of a noisy function (where each piece is one region of P and P is fixed beforehand) in order to maintain the local quadratic error of approximation on each cell equally low. Although this extension is not trivial, we show that a simple algorithm based on upper confidence bounds can be proved to be adaptive to the function itself in a near-optimal way, when |P| is chosen to be of minimax-optimal order on the class of \alpha-Hölder functions.

Cite

Text

Carpentier and Maillard. "Online Allocation and Homogeneous Partitioning for Piecewise Constant Mean-Approximation." Neural Information Processing Systems, 2012.

Markdown

[Carpentier and Maillard. "Online Allocation and Homogeneous Partitioning for Piecewise Constant Mean-Approximation." Neural Information Processing Systems, 2012.](https://mlanthology.org/neurips/2012/carpentier2012neurips-online/)

BibTeX

@inproceedings{carpentier2012neurips-online,
  title     = {{Online Allocation and Homogeneous Partitioning for Piecewise Constant Mean-Approximation}},
  author    = {Carpentier, Alexandra and Maillard, Odalric-ambrym},
  booktitle = {Neural Information Processing Systems},
  year      = {2012},
  pages     = {1961-1969},
  url       = {https://mlanthology.org/neurips/2012/carpentier2012neurips-online/}
}