Learning When to Stop Thinking and Do Something!

Póczos, Barnabás; Abbasi-Yadkori, Yasin; Szepesvári, Csaba; Greiner, Russell; Sturtevant, Nathan R.

doi:10.1145/1553374.1553480

ICML 2009 pp. 825-832

Abstract

An anytime algorithm is capable of returning a response to the given task at essentially any time; typically the quality of the response improves as the time increases. Here, we consider the challenge of learning when we should terminate such algorithms on each of a sequence of iid tasks, to optimize the expected average reward per unit time. We provide an algorithm for answering this question. We combine the global optimizer Cross Entropy method and the local gradient ascent, and theoretically investigate how far the estimated gradient is from the true gradient. We empirically demonstrate the applicability of the proposed algorithm on a toy problem, as well as on a real-world face detection task.
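
As an illustration of the objective described in the abstract, the following is a minimal sketch, not the authors' algorithm. It assumes a hypothetical toy anytime algorithm whose response quality after `theta` time units is `1 - exp(-a * theta)`, with a task-specific rate `a`, a fixed per-task `overhead`, and a policy that stops every task at the same time `theta`. It uses only a basic cross-entropy search over `theta` and omits the gradient-ascent step and the theoretical analysis mentioned in the abstract. All function names, parameters, and constants are assumptions made for this example.

```python
import math
import random


def average_reward_rate(theta, tasks, overhead=1.0):
    """Total reward divided by total time when each task is stopped at `theta`.

    `tasks` is a list of hypothetical quality-growth rates `a`; the reward
    for one task is the toy quality curve 1 - exp(-a * theta).
    """
    total_reward = sum(1.0 - math.exp(-a * theta) for a in tasks)
    total_time = len(tasks) * (overhead + theta)
    return total_reward / total_time


def cross_entropy_search(tasks, iters=30, pop=50, elite=10, seed=0):
    """Basic cross-entropy method over a single stopping time `theta`."""
    rng = random.Random(seed)
    mu, sigma = 5.0, 5.0  # assumed initial Gaussian sampling distribution
    for _ in range(iters):
        # Sample candidate stopping times, clipped to be non-negative.
        samples = [max(0.0, rng.gauss(mu, sigma)) for _ in range(pop)]
        # Keep the candidates with the highest average reward per unit time.
        samples.sort(key=lambda th: average_reward_rate(th, tasks), reverse=True)
        best = samples[:elite]
        # Refit the sampling distribution to the elite set.
        mu = sum(best) / elite
        sigma = math.sqrt(sum((b - mu) ** 2 for b in best) / elite) + 1e-3
    return mu


# A fixed sample of iid tasks (hypothetical growth rates in [0.5, 1.5]).
task_rng = random.Random(1)
tasks = [task_rng.uniform(0.5, 1.5) for _ in range(200)]

theta_hat = cross_entropy_search(tasks)
print("stopping time:", round(theta_hat, 3))
print("reward rate:", round(average_reward_rate(theta_hat, tasks), 4))
```

Stopping too early earns little reward per task, while stopping too late spends time on diminishing improvements; the search settles on a stopping time between these extremes. Evaluating on a fixed task sample keeps the sketch deterministic, which is a simplification of the stochastic setting the abstract describes.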

Cite

Text

Póczos et al. "Learning When to Stop Thinking and Do Something!" International Conference on Machine Learning, 2009. doi:10.1145/1553374.1553480

Markdown

[Póczos et al. "Learning When to Stop Thinking and Do Something!" International Conference on Machine Learning, 2009.](https://mlanthology.org/icml/2009/poczos2009icml-learning/) doi:10.1145/1553374.1553480

BibTeX

@inproceedings{poczos2009icml-learning,
  title     = {{Learning When to Stop Thinking and Do Something!}},
  author    = {Póczos, Barnabás and Abbasi-Yadkori, Yasin and Szepesvári, Csaba and Greiner, Russell and Sturtevant, Nathan R.},
  booktitle = {International Conference on Machine Learning},
  year      = {2009},
  pages     = {825--832},
  doi       = {10.1145/1553374.1553480},
  url       = {https://mlanthology.org/icml/2009/poczos2009icml-learning/}
}