Making Online Decisions with Bounded Memory

Abstract

We study the online decision problem in which there are T steps to play and n actions to choose. For this problem, several algorithms achieve an optimal regret of $O(\sqrt{T \ln n})$ , but they all require about T ^ n states, which one may not be able to afford when n and T are very large. We are interested in such large scale problems, and we would like to understand what an online algorithm can achieve with only a bounded number of states. We provide two algorithms, both with m ^ n  − 1 states, for a parameter m , which achieve regret of O ( m  + ( T / m )ln ( mn )) and $O(n \sqrt{m} +T/\sqrt{m})$ , respectively. We also show that any online algorithm with m ^ n  − 1 states must suffer a regret of Ω( T / m ), which is close to what our algorithms achieve.

Cite

Text

Lu and Lu. "Making Online Decisions with Bounded Memory." International Conference on Algorithmic Learning Theory, 2011. doi:10.1007/978-3-642-24412-4_21

Markdown

[Lu and Lu. "Making Online Decisions with Bounded Memory." International Conference on Algorithmic Learning Theory, 2011.](https://mlanthology.org/alt/2011/lu2011alt-making/) doi:10.1007/978-3-642-24412-4_21

BibTeX

@inproceedings{lu2011alt-making,
  title     = {{Making Online Decisions with Bounded Memory}},
  author    = {Lu, Chi-Jen and Lu, Wei-Fu},
  booktitle = {International Conference on Algorithmic Learning Theory},
  year      = {2011},
  pages     = {249-261},
  doi       = {10.1007/978-3-642-24412-4_21},
  url       = {https://mlanthology.org/alt/2011/lu2011alt-making/}
}