Batch Stationary Distribution Estimation

Abstract

We consider the problem of approximating the stationary distribution of an ergodic Markov chain given a set of sampled transitions. Classical simulation-based approaches assume access to the underlying process so that trajectories of sufficient length can be gathered to approximate stationary sampling. Instead, we consider an alternative setting where a fixed set of transitions has been collected beforehand, by a separate, possibly unknown procedure. The goal is still to estimate properties of the stationary distribution, but without additional access to the underlying system. We propose a consistent estimator that is based on recovering a correction ratio function over the given data. In particular, we develop a variational power method (VPM) that provides provably consistent estimates under general conditions. In addition to unifying a number of existing approaches from different subfields, we also find that VPM yields significantly better estimates across a range of problems, including queueing, stochastic differential equations, post-processing MCMC, and off-policy evaluation.
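To make the idea of a correction ratio over fixed data concrete, the following is a minimal tabular sketch, not the paper's VPM (which is variational). It assumes a small finite state space and that every state appears at least once as a source state in the batch. The function name `estimate_stationary`, its parameters, and the toy data are hypothetical and chosen only for illustration. The sketch runs a plain power iteration on the ratio tau(x) = mu(x) / p(x), where p is the unknown distribution the source states were drawn from and mu is the stationary distribution.

```python
import numpy as np

def estimate_stationary(transitions, n_states, n_iters=200):
    """Tabular power iteration on a correction ratio (illustrative only).

    transitions: list of (x, x_next) integer pairs, collected beforehand.
    Returns (tau, mu): the ratio estimate and the implied stationary estimate.
    """
    x = np.array([t[0] for t in transitions])
    x_next = np.array([t[1] for t in transitions])
    n = len(transitions)

    # Empirical distribution of source states in the batch.
    p_hat = np.bincount(x, minlength=n_states) / n
    if np.any(p_hat == 0):
        raise ValueError("every state must appear as a source state")

    tau = np.ones(n_states)  # start from the data distribution itself
    for _ in range(n_iters):
        # Push the reweighted data distribution through one sampled transition.
        mu = np.bincount(x_next, weights=tau[x], minlength=n_states) / n
        tau = mu / p_hat
        # Keep the reweighted data distribution normalized to total mass 1.
        tau /= np.mean(tau[x])
    return tau, tau * p_hat

# Toy batch: each state is a source equally often (uniform p, not stationary).
transitions = [
    (0, 0), (0, 0), (0, 1), (0, 1),
    (1, 0), (1, 1), (1, 1), (1, 2),
    (2, 1), (2, 1), (2, 2), (2, 2),
]
tau, mu = estimate_stationary(transitions, n_states=3)
print("ratio:", tau)
print("stationary estimate:", mu)
```

On this toy batch the empirical transition probabilities are 0.5/0.5 out of states 0 and 2 and 0.25/0.5/0.25 out of state 1, so detailed balance gives the stationary distribution (0.25, 0.5, 0.25), which the estimate should match even though the source states were sampled uniformly. In continuous or large state spaces a count-based ratio like this is not available, which is where a learned ratio function becomes necessary.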

Cite

Text

Wen et al. "Batch Stationary Distribution Estimation." International Conference on Machine Learning, 2020.

Markdown

[Wen et al. "Batch Stationary Distribution Estimation." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/wen2020icml-batch/)

BibTeX

@inproceedings{wen2020icml-batch,
  title     = {{Batch Stationary Distribution Estimation}},
  author    = {Wen, Junfeng and Dai, Bo and Li, Lihong and Schuurmans, Dale},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {10203-10213},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/wen2020icml-batch/}
}