Reinforcement Learning for Trading
Abstract
We propose to train trading systems by optimizing financial objective functions via reinforcement learning. The performance functions that we consider are profit or wealth, the Sharpe ratio and our recently proposed differential Sharpe ratio for online learning. In Moody & Wu (1997), we presented empirical results that demonstrate the advantages of reinforcement learning relative to supervised learning. Here we extend our previous work to compare Q-Learning to our Recurrent Reinforcement Learning (RRL) algorithm. We provide new simulation results that demonstrate the presence of predictability in the monthly S&P 500 Stock Index for the 25-year period 1970 through 1994, as well as a sensitivity analysis that provides economic insight into the trader's structure.
Cite
Text
Moody and Saffell. "Reinforcement Learning for Trading." Neural Information Processing Systems, 1998.
Markdown
[Moody and Saffell. "Reinforcement Learning for Trading." Neural Information Processing Systems, 1998.](https://mlanthology.org/neurips/1998/moody1998neurips-reinforcement/)
BibTeX
@inproceedings{moody1998neurips-reinforcement,
  title = {{Reinforcement Learning for Trading}},
  author = {Moody, John E. and Saffell, Matthew},
  booktitle = {Neural Information Processing Systems},
  year = {1998},
  pages = {917-923},
  url = {https://mlanthology.org/neurips/1998/moody1998neurips-reinforcement/}
}