Efficient Contextual Bandits with Continuous Actions

Abstract

We create a computationally tractable learning algorithm for contextual bandits with continuous actions having unknown structure. The new reduction-style algorithm composes with most supervised learning representations. We prove that this algorithm works in a general sense and verify the new functionality with large-scale experiments.

Cite

Text

Majzoubi et al. "Efficient Contextual Bandits with Continuous Actions." Neural Information Processing Systems, 2020.

Markdown

[Majzoubi et al. "Efficient Contextual Bandits with Continuous Actions." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/majzoubi2020neurips-efficient/)

BibTeX

@inproceedings{majzoubi2020neurips-efficient,
  title     = {{Efficient Contextual Bandits with Continuous Actions}},
  author    = {Majzoubi, Maryam and Zhang, Chicheng and Chari, Rajan and Krishnamurthy, Akshay and Langford, John and Slivkins, Aleksandrs},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/majzoubi2020neurips-efficient/}
}