The Loss from Imperfect Value Functions in Expectation-Based and Minimax-Based Tasks
Abstract
Many reinforcement learning (RL) algorithms approximate an optimal value function. Once the function is known, it is easy to determine an optimal policy. For most real-world applications, however, the value function is too complex to be represented by lookup tables, making it necessary to use function approximators such as neural networks. In this case, convergence to the optimal value function is no longer guaranteed, and it becomes important to know to what extent performance diminishes when one uses approximate value functions instead of optimal ones. This problem has recently been discussed in the context of expectation-based Markov decision problems. Our analysis generalizes this work to minimax-based Markov decision problems, yields new results for expectation-based tasks, and shows how minimax-based and expectation-based Markov decision problems relate.
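The loss the abstract refers to can be made concrete in a small experiment. The sketch below (not the paper's construction; the MDP, rewards, and perturbation are invented for illustration) builds a toy three-state expectation-based MDP, computes the optimal value function by value iteration, then acts greedily with respect to a perturbed value function and measures the per-state performance loss against the classic bound 2γε/(1−γ), where ε is the sup-norm error of the approximate value function.

```python
import numpy as np

gamma = 0.9
n_states, n_actions = 3, 2
# Deterministic toy MDP (illustrative numbers):
# P[a, s] = successor state, R[a, s] = immediate reward.
P = np.array([[1, 2, 0],
              [2, 0, 1]])
R = np.array([[1.0, 0.0, 0.0],
              [0.0, 0.5, 2.0]])

def greedy(V):
    # One-step lookahead: argmax_a R(s, a) + gamma * V(next state).
    Q = R + gamma * V[P]          # shape (n_actions, n_states)
    return Q.argmax(axis=0)

def evaluate(policy):
    # Solve V_pi = R_pi + gamma * P_pi V_pi exactly as a linear system.
    Ppi = np.zeros((n_states, n_states))
    Rpi = np.array([R[policy[s], s] for s in range(n_states)])
    for s in range(n_states):
        Ppi[s, P[policy[s], s]] = 1.0
    return np.linalg.solve(np.eye(n_states) - gamma * Ppi, Rpi)

# Value iteration to (numerically) optimal V*.
V = np.zeros(n_states)
for _ in range(500):
    V = (R + gamma * V[P]).max(axis=0)

# Hand-picked perturbation of V*; acting greedily on it changes the policy.
eps = 1.5
V_approx = V + np.array([0.0, eps, -eps])
loss = V - evaluate(greedy(V_approx))   # per-state loss V*(s) - V_pi(s)

print("per-state loss:", loss)
print("max loss:", loss.max(), "<= bound:", 2 * gamma * eps / (1 - gamma))
```

Here the greedy policy under the perturbed values is strictly worse than the optimal one, but its loss stays well inside the 2γε/(1−γ) bound; the paper's contribution is analyzing such bounds for minimax-based as well as expectation-based tasks.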
Cite
Text
Heger. "The Loss from Imperfect Value Functions in Expectation-Based and Minimax-Based Tasks." Machine Learning, 1996. doi:10.1023/A:1018016523433
Markdown
[Heger. "The Loss from Imperfect Value Functions in Expectation-Based and Minimax-Based Tasks." Machine Learning, 1996.](https://mlanthology.org/mlj/1996/heger1996mlj-loss/) doi:10.1023/A:1018016523433
BibTeX
@article{heger1996mlj-loss,
title = {{The Loss from Imperfect Value Functions in Expectation-Based and Minimax-Based Tasks}},
author = {Heger, Matthias},
journal = {Machine Learning},
year = {1996},
pages = {197--225},
doi = {10.1023/A:1018016523433},
volume = {22},
url = {https://mlanthology.org/mlj/1996/heger1996mlj-loss/}
}