Derivative-Free & Order-Robust Optimisation

Abstract

In this paper, we formalise order-robust optimisation as an instance of online learning minimising simple regret, and propose Vroom, a zeroth-order optimisation algorithm capable of achieving vanishing regret in non-stationary environments while recovering favourable rates under stochastic reward-generating processes. Our results are the first to target simple regret definitions in adversarial scenarios, unveiling a challenge that has rarely been considered in prior work.

Cite

Text

Ammar et al. "Derivative-Free & Order-Robust Optimisation." Artificial Intelligence and Statistics, 2020.

Markdown

[Ammar et al. "Derivative-Free & Order-Robust Optimisation." Artificial Intelligence and Statistics, 2020.](https://mlanthology.org/aistats/2020/ammar2020aistats-derivativefree/)

BibTeX

@inproceedings{ammar2020aistats-derivativefree,
  title     = {{Derivative-Free \& Order-Robust Optimisation}},
  author    = {Ammar, Haitham and Gabillon, Victor and Tutunov, Rasul and Valko, Michal},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2020},
  pages     = {2293--2303},
  volume    = {108},
  url       = {https://mlanthology.org/aistats/2020/ammar2020aistats-derivativefree/}
}