Pursuit-Evasion Without Regret, with an Application to Trading
Abstract
We propose a state-based variant of the classical online learning problem of tracking the best expert. In our setting, the actions of the algorithm and experts correspond to local moves through a continuous and bounded state space. At each step, Nature chooses payoffs as a function of each player’s current position and action. Our model therefore integrates the problem of prediction with expert advice with the stateful formalisms of reinforcement learning. Traditional no-regret learning approaches no longer apply, but we propose a simple algorithm that provably achieves no-regret when the state space is any convex Euclidean region. Our algorithm combines techniques from online learning with results from the literature on pursuit-evasion games. We describe a quantitative trading application in which the convex region captures inventory risk constraints, and local moves limit market impact. Using historical market data, we show experimentally that our algorithm has a strong advantage over classic no-regret approaches.
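To make the setting concrete, the following Python snippet sketches the flavor of state-based expert tracking described above. It is a hedged illustration, not the paper's algorithm: the box-shaped convex region, the fixed expert positions, the local-move radius delta, and the drifting-target payoff are all assumptions made for the example. It combines a standard multiplicative-weights update over experts with a delta-bounded pursuit step toward the weighted mixture of expert positions.

import numpy as np

# Minimal sketch (assumptions noted above): experts sit at fixed points
# in the convex region [-1, 1]^d, the learner may move at most delta per
# step, and Nature's payoff is a hypothetical function of position.
rng = np.random.default_rng(0)

d, K, T = 2, 5, 200            # state dimension, number of experts, horizon
delta = 0.05                   # maximum local move per step (assumed)
eta = np.sqrt(np.log(K) / T)   # multiplicative-weights learning rate

lo, hi = -1.0, 1.0                          # the convex region: a box
experts = rng.uniform(lo, hi, size=(K, d))  # experts' positions (assumed fixed)
weights = np.ones(K) / K
x = np.zeros(d)                             # learner's current state

for t in range(T):
    # Nature picks a payoff as a function of position; here, an assumed
    # example: payoff = negative distance to a slowly drifting target.
    target = np.array([np.sin(t / 20), np.cos(t / 20)])
    payoff = lambda p: -np.linalg.norm(p - target)

    # Pursuit step: head toward the weight-averaged expert position,
    # but travel at most delta (the local-move constraint).
    goal = weights @ experts
    step = goal - x
    dist = np.linalg.norm(step)
    if dist > delta:
        step *= delta / dist
    x = np.clip(x + step, lo, hi)   # stay inside the convex region

    # Standard multiplicative-weights update on the experts' payoffs.
    gains = np.array([payoff(p) for p in experts])
    weights *= np.exp(eta * gains)
    weights /= weights.sum()

print("final state:", x, "heaviest expert:", experts[np.argmax(weights)])

Because each step moves at most delta and ends with a projection (np.clip) back into the box, the learner respects both the local-move and bounded-state constraints while its weights concentrate on the best-performing expert, which is the combination of pursuit and online learning the abstract describes.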
Cite

Text

Dworkin et al. "Pursuit-Evasion Without Regret, with an Application to Trading." International Conference on Machine Learning, 2014.

Markdown

[Dworkin et al. "Pursuit-Evasion Without Regret, with an Application to Trading." International Conference on Machine Learning, 2014.](https://mlanthology.org/icml/2014/dworkin2014icml-pursuitevasion/)

BibTeX
@inproceedings{dworkin2014icml-pursuitevasion,
title = {{Pursuit-Evasion Without Regret, with an Application to Trading}},
author = {Dworkin, Lili and Kearns, Michael and Nevmyvaka, Yuriy},
booktitle = {International Conference on Machine Learning},
year = {2014},
pages = {1521-1529},
volume = {32},
url = {https://mlanthology.org/icml/2014/dworkin2014icml-pursuitevasion/}
}