Pursuit-Evasion Without Regret, with an Application to Trading

Abstract

We propose a state-based variant of the classical online learning problem of tracking the best expert. In our setting, the actions of the algorithm and experts correspond to local moves through a continuous and bounded state space. At each step, Nature chooses payoffs as a function of each player's current position and action. Our model thus integrates the classical problem of prediction with expert advice and the stateful formalisms of reinforcement learning. Traditional no-regret learning approaches no longer apply, but we propose a simple algorithm that provably achieves no regret when the state space is any convex Euclidean region. Our algorithm combines techniques from online learning with results from the literature on pursuit-evasion games. We describe a quantitative trading application in which the convex region captures inventory risk constraints, and local moves limit market impact. Using historical market data, we show experimentally that our algorithm has a strong advantage over classic no-regret approaches.
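
To make the setting concrete, below is a minimal Python sketch of the model, not the paper's algorithm: a learner and a set of experts occupy points in a convex bounded region (a unit ball here, as a stand-in), every player makes a bounded "local move" each round, Nature assigns payoffs as a function of position (a fresh random linear payoff here, purely for illustration), and the learner naively pursues the current leader. All function names, the payoff model, and the step sizes are illustrative assumptions.

import numpy as np

def project_to_ball(x, radius=1.0):
    # Project onto a Euclidean ball -- a stand-in for an arbitrary
    # convex, bounded state space.
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def local_move(pos, target, max_step):
    # Move from pos toward target, but no farther than max_step
    # (the model's local-move constraint).
    delta = target - pos
    dist = np.linalg.norm(delta)
    return target.copy() if dist <= max_step else pos + delta * (max_step / dist)

rng = np.random.default_rng(0)
dim, n_experts, horizon, max_step = 2, 5, 200, 0.1

experts = rng.uniform(-0.5, 0.5, size=(n_experts, dim))
learner = np.zeros(dim)
cum_payoff = np.zeros(n_experts)

for t in range(horizon):
    # Experts take small (local) random steps, projected back into the region.
    experts = np.array([project_to_ball(e + rng.normal(0.0, max_step / 2, dim))
                        for e in experts])
    # Nature chooses payoffs as a function of position; here a random
    # linear payoff drawn fresh each round, purely for illustration.
    w = rng.normal(size=dim)
    cum_payoff += experts @ w
    # Naive pursuit: chase the position of the current leader under the
    # local-move constraint, staying inside the convex region.
    leader = experts[np.argmax(cum_payoff)]
    learner = project_to_ball(local_move(learner, leader, max_step))

Because the learner can only take bounded steps, it generally cannot jump to the leader's position; the paper's contribution is a strategy with a provable no-regret guarantee in this constrained setting, which the naive chaser above does not provide.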

Cite

Text

Dworkin et al. "Pursuit-Evasion Without Regret, with an Application to Trading." International Conference on Machine Learning, 2014.

Markdown

[Dworkin et al. "Pursuit-Evasion Without Regret, with an Application to Trading." International Conference on Machine Learning, 2014.](https://mlanthology.org/icml/2014/dworkin2014icml-pursuitevasion/)

BibTeX

@inproceedings{dworkin2014icml-pursuitevasion,
  title     = {{Pursuit-Evasion Without Regret, with an Application to Trading}},
  author    = {Dworkin, Lili and Kearns, Michael and Nevmyvaka, Yuriy},
  booktitle = {International Conference on Machine Learning},
  year      = {2014},
  pages     = {1521--1529},
  volume    = {32},
  url       = {https://mlanthology.org/icml/2014/dworkin2014icml-pursuitevasion/}
}