Making Online Decisions with Bounded Memory
Abstract
We study the online decision problem in which there are T steps to play and n actions to choose. For this problem, several algorithms achieve an optimal regret of $O(\sqrt{T \ln n})$ , but they all require about T ^ n states, which one may not be able to afford when n and T are very large. We are interested in such large scale problems, and we would like to understand what an online algorithm can achieve with only a bounded number of states. We provide two algorithms, both with m ^ n − 1 states, for a parameter m , which achieve regret of O ( m + ( T / m )ln ( mn )) and $O(n \sqrt{m} +T/\sqrt{m})$ , respectively. We also show that any online algorithm with m ^ n − 1 states must suffer a regret of Ω( T / m ), which is close to what our algorithms achieve.
Cite
Text
Lu and Lu. "Making Online Decisions with Bounded Memory." International Conference on Algorithmic Learning Theory, 2011. doi:10.1007/978-3-642-24412-4_21Markdown
[Lu and Lu. "Making Online Decisions with Bounded Memory." International Conference on Algorithmic Learning Theory, 2011.](https://mlanthology.org/alt/2011/lu2011alt-making/) doi:10.1007/978-3-642-24412-4_21BibTeX
@inproceedings{lu2011alt-making,
title = {{Making Online Decisions with Bounded Memory}},
author = {Lu, Chi-Jen and Lu, Wei-Fu},
booktitle = {International Conference on Algorithmic Learning Theory},
year = {2011},
pages = {249-261},
doi = {10.1007/978-3-642-24412-4_21},
url = {https://mlanthology.org/alt/2011/lu2011alt-making/}
}