Adaptive Algorithms and Data-Dependent Guarantees for Bandit Convex Optimization

Abstract

We present adaptive algorithms with strong data-dependent regret guarantees for the problem of bandit convex optimization. In the process, we develop a general framework from which all previous main results in this setting can be recovered. The key method is the introduction of adaptive regularization. By appropriately adapting the exploration scheme, we show that one can derive regret guarantees which can be significantly more favorable than those previously known. Moreover, our analysis also modularizes the problematic quantities in achieving the conjectured minimax optimal rates in the most general setting of the problem.

Cite

Text

Yang and Mohri. "Adaptive Algorithms and Data-Dependent Guarantees for Bandit Convex Optimization." Conference on Uncertainty in Artificial Intelligence, 2016.

Markdown

[Yang and Mohri. "Adaptive Algorithms and Data-Dependent Guarantees for Bandit Convex Optimization." Conference on Uncertainty in Artificial Intelligence, 2016.](https://mlanthology.org/uai/2016/yang2016uai-adaptive/)

BibTeX

@inproceedings{yang2016uai-adaptive,
  title     = {{Adaptive Algorithms and Data-Dependent Guarantees for Bandit Convex Optimization}},
  author    = {Yang, Scott and Mohri, Mehryar},
  booktitle = {Conference on Uncertainty in Artificial Intelligence},
  year      = {2016},
  url       = {https://mlanthology.org/uai/2016/yang2016uai-adaptive/}
}