LOCO: Adaptive Exploration in Reinforcement Learning via Local Estimation of Contraction Coefficients
Abstract
We offer a novel approach to balancing exploration and exploitation in reinforcement learning (RL). To do so, we characterize an environment’s exploration difficulty via the Second Largest Eigenvalue Modulus (SLEM) of the Markov chain induced by uniform stochastic behaviour. Specifically, we investigate the connection between state-space coverage and the SLEM of this Markov chain, and use the theory of contraction coefficients to derive estimates of this eigenvalue of interest. Furthermore, we introduce a method for estimating contraction coefficients at a local level and leverage it to design a novel exploration algorithm. We evaluate our algorithm on a series of GridWorld tasks of varying size and complexity.
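To make the quantity in the abstract concrete, the following is a minimal illustrative sketch (not code from the paper): it builds the transition matrix of the Markov chain induced by a uniform random policy on a small GridWorld and computes its SLEM numerically. All function names and the wall-handling convention (off-grid moves leave the agent in place) are our own assumptions.

```python
import numpy as np

def uniform_policy_chain(n):
    """Transition matrix of the chain induced by a uniform random policy
    on an n x n grid (actions: up/down/left/right; moves off the grid
    keep the agent in place)."""
    P = np.zeros((n * n, n * n))
    for r in range(n):
        for c in range(n):
            s = r * n + c
            for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
                rr, cc = r + dr, c + dc
                if 0 <= rr < n and 0 <= cc < n:
                    P[s, rr * n + cc] += 0.25
                else:
                    P[s, s] += 0.25  # bump into a wall: stay put
    return P

def slem(P):
    """Second Largest Eigenvalue Modulus of a stochastic matrix P."""
    moduli = np.sort(np.abs(np.linalg.eigvals(P)))[::-1]
    return moduli[1]  # moduli[0] is the trivial eigenvalue 1

# Larger grids mix more slowly, so the SLEM moves closer to 1,
# matching the intuition that they are harder to explore.
print(slem(uniform_policy_chain(4)))
print(slem(uniform_policy_chain(8)))
```

A SLEM close to 1 indicates slow mixing of the uniform-policy chain, i.e. poor state-space coverage per unit time, which is the sense in which the paper uses it as a measure of exploration difficulty.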
Cite
Text
Anonymous. "LOCO: Adaptive Exploration in Reinforcement Learning via Local Estimation of Contraction Coefficients." ICLR 2021 Workshops: SSL-RL, 2021.

Markdown
[Anonymous. "LOCO: Adaptive Exploration in Reinforcement Learning via Local Estimation of Contraction Coefficients." ICLR 2021 Workshops: SSL-RL, 2021.](https://mlanthology.org/iclrw/2021/anonymous2021iclrw-loco/)

BibTeX
@inproceedings{anonymous2021iclrw-loco,
title = {{LOCO: Adaptive Exploration in Reinforcement Learning via Local Estimation of Contraction Coefficients}},
author = {Anonymous},
booktitle = {ICLR 2021 Workshops: SSL-RL},
year = {2021},
url = {https://mlanthology.org/iclrw/2021/anonymous2021iclrw-loco/}
}