Personalizing Many Decisions with High-Dimensional Covariates

NeurIPS 2019 pp. 11473-11484

/neurips/2019/hamidi2019neurips-personalizing/

Abstract

We consider the k-armed stochastic contextual bandit problem with d dimensional features, when both k and d can be large. To the best of our knowledge, all existing algorithm for this problem have a regret bound that scale as polynomials of degree at least two in k and d. The main contribution of this paper is to introduce and theoretically analyze a new algorithm (REAL Bandit) with a regret that scales by r^2(k+d) when r is rank of the k by d matrix of unknown parameters. REAL Bandit relies on ideas from low-rank matrix estimation literature and a new row-enhancement subroutine that yields sharper bounds for estimating each row of the parameter matrix that may be of independent interest.

PDF NeurIPS Semantic Scholar

Cite

Text

Hamidi et al. "Personalizing Many Decisions with High-Dimensional Covariates." Neural Information Processing Systems, 2019.

Markdown

[Hamidi et al. "Personalizing Many Decisions with High-Dimensional Covariates." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/hamidi2019neurips-personalizing/)

BibTeX

@inproceedings{hamidi2019neurips-personalizing,
  title     = {{Personalizing Many Decisions with High-Dimensional Covariates}},
  author    = {Hamidi, Nima and Bayati, Mohsen and Gupta, Kapil},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {11473-11484},
  url       = {https://mlanthology.org/neurips/2019/hamidi2019neurips-personalizing/}
}