Bias and Variance in Value Function Estimation

Abstract

We consider the bias and variance of value function estimation that are caused by using an empirical model instead of the true model. We analyze this bias and variance for Markov processes from a classical (frequentist) statistical point of view, and in a Bayesian setting. Using a second-order approximation, we provide explicit expressions for the bias and variance in terms of the transition counts and the reward statistics. We present supporting experiments with artificial Markov chains and with a large transactional database provided by a mail-order catalog firm.
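
To make the setting concrete, the following is a minimal sketch of how a value function computed from an empirical model can differ from the one computed from the true model. It is not the paper's second-order approximation: it estimates the bias and variance by brute-force simulation instead. The two-state chain, the discount factor, the assumption that rewards are known exactly, and all function names are illustrative assumptions, not taken from the paper.

```python
import random
import statistics

GAMMA = 0.9  # discount factor (an assumption; the abstract does not fix one)


def value_2state(p, r, gamma=GAMMA):
    """Solve V = r + gamma * P V exactly for a 2-state Markov reward process.

    p[i][j] is the probability of moving from state i to state j,
    r[i] is the expected reward in state i.
    """
    # Entries of the 2x2 matrix (I - gamma * P), inverted in closed form.
    a = 1 - gamma * p[0][0]
    b = -gamma * p[0][1]
    c = -gamma * p[1][0]
    d = 1 - gamma * p[1][1]
    det = a * d - b * c
    return [(d * r[0] - b * r[1]) / det, (a * r[1] - c * r[0]) / det]


def empirical_model(p, n, rng):
    """Build an empirical transition matrix from n sampled transitions per state."""
    p_hat = []
    for row in p:
        to_first = sum(1 for _ in range(n) if rng.random() < row[0])
        p_hat.append([to_first / n, 1 - to_first / n])
    return p_hat


def bias_and_variance(p, r, n, trials, seed=0):
    """Monte Carlo estimate of the bias and variance of the plug-in value of state 0.

    Rewards are treated as known, so only the transition counts are random.
    """
    rng = random.Random(seed)
    v_true = value_2state(p, r)[0]
    estimates = [
        value_2state(empirical_model(p, n, rng), r)[0] for _ in range(trials)
    ]
    return statistics.fmean(estimates) - v_true, statistics.variance(estimates)


if __name__ == "__main__":
    P = [[0.7, 0.3], [0.4, 0.6]]  # hypothetical true transition matrix
    R = [1.0, 0.0]                # hypothetical rewards
    for n in (10, 100, 1000):
        bias, var = bias_and_variance(P, R, n=n, trials=500)
        print(f"n={n}: bias={bias:.4f}, variance={var:.4f}")
```

Because the value function is a nonlinear function of the transition probabilities, the plug-in estimate is generally biased even though the empirical transition frequencies are themselves unbiased. Running the script with increasing `n` gives a rough feel for how both quantities shrink as the transition counts grow.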

Cite

Text

Mannor et al. "Bias and Variance in Value Function Estimation." International Conference on Machine Learning, 2004. doi:10.1145/1015330.1015402

Markdown

[Mannor et al. "Bias and Variance in Value Function Estimation." International Conference on Machine Learning, 2004.](https://mlanthology.org/icml/2004/mannor2004icml-bias/) doi:10.1145/1015330.1015402

BibTeX

@inproceedings{mannor2004icml-bias,
  title     = {{Bias and Variance in Value Function Estimation}},
  author    = {Mannor, Shie and Simester, Duncan and Sun, Peng and Tsitsiklis, John N.},
  booktitle = {International Conference on Machine Learning},
  year      = {2004},
  doi       = {10.1145/1015330.1015402},
  url       = {https://mlanthology.org/icml/2004/mannor2004icml-bias/}
}