Derivative-Free & Order-Robust Optimisation
Abstract
In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments while recovering favorable rates under stochastic reward-generating processes. Our results are the first to target simple regret definitions in adversarial scenarios, unveiling a challenge that has rarely been considered in prior work.
Cite
Text
Ammar et al. "Derivative-Free & Order-Robust Optimisation." Artificial Intelligence and Statistics, 2020.
Markdown
[Ammar et al. "Derivative-Free & Order-Robust Optimisation." Artificial Intelligence and Statistics, 2020.](https://mlanthology.org/aistats/2020/ammar2020aistats-derivativefree/)
BibTeX
@inproceedings{ammar2020aistats-derivativefree,
title = {{Derivative-Free \& Order-Robust Optimisation}},
author = {Ammar, Haitham and Gabillon, Victor and Tutunov, Rasul and Valko, Michal},
booktitle = {Artificial Intelligence and Statistics},
year = {2020},
pages = {2293--2303},
volume = {108},
url = {https://mlanthology.org/aistats/2020/ammar2020aistats-derivativefree/}
}