Efficient Contextual Bandits with Continuous Actions
Abstract
We create a computationally tractable learning algorithm for contextual bandits with continuous actions having unknown structure. The new reduction-style algorithm composes with most supervised learning representations. We prove that this algorithm works in a general sense and verify the new functionality with large-scale experiments.
Cite
Text
Majzoubi et al. "Efficient Contextual Bandits with Continuous Actions." Neural Information Processing Systems, 2020.
Markdown
[Majzoubi et al. "Efficient Contextual Bandits with Continuous Actions." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/majzoubi2020neurips-efficient/)
BibTeX
@inproceedings{majzoubi2020neurips-efficient,
  title = {{Efficient Contextual Bandits with Continuous Actions}},
  author = {Majzoubi, Maryam and Zhang, Chicheng and Chari, Rajan and Krishnamurthy, Akshay and Langford, John and Slivkins, Aleksandrs},
  booktitle = {Neural Information Processing Systems},
  year = {2020},
  url = {https://mlanthology.org/neurips/2020/majzoubi2020neurips-efficient/}
}