Convex Reinforcement Learning in Finite Trials
Abstract
Convex Reinforcement Learning (RL) is a recently introduced framework that generalizes the standard RL objective to any convex (or concave) function of the state distribution induced by the agent's policy. This framework subsumes several applications of practical interest, such as pure exploration, imitation learning, and risk-averse RL, among others. However, the previous convex RL literature implicitly evaluates the agent's performance over infinite realizations (or trials), while most of the applications require excellent performance over a handful of trials, or even just one. To meet this practical demand, we formulate convex RL in finite trials, where the objective is any convex function of the empirical state distribution computed over a finite number of realizations. In this paper, we provide a comprehensive theoretical study of the setting, which includes an analysis of the importance of non-Markovian policies to achieve optimality, as well as a characterization of the computational and statistical complexity of the problem in various configurations.
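The distinction between the infinite-trials and finite-trials objectives can be sketched numerically. The following is a minimal illustration, not the paper's method: the toy random-walk environment, uniform policy, and horizon are all made up for this example. It forms the empirical state distribution from n episodes and scores it with a concave utility (entropy, as in pure exploration).

```python
import numpy as np

n_states, horizon = 4, 10  # hypothetical toy environment parameters

def rollout(rng):
    """One episode of a toy cyclic random walk under a uniform policy."""
    s, visits = 0, np.zeros(n_states)
    for _ in range(horizon):
        visits[s] += 1
        s = (s + rng.choice([-1, 1])) % n_states  # random-walk transition
    return visits

def entropy(d):
    """Concave objective F(d) = -sum_s d(s) log d(s) (pure exploration)."""
    p = d[d > 0]
    return -np.sum(p * np.log(p))

def finite_trials_objective(n, rng):
    """F applied to the empirical state distribution over n trials."""
    counts = sum(rollout(rng) for _ in range(n))
    return entropy(counts / counts.sum())

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # With few trials the empirical distribution d_n is noisy, so F(d_n)
    # can differ markedly from the infinite-trials value F(E[d]).
    for n in (1, 10, 1000):
        print(f"n = {n:4d}  F(d_n) = {finite_trials_objective(n, rng):.4f}")
```

Since F is concave here, Jensen's inequality gives E[F(d_n)] ≤ F(E[d]), so optimizing the usual infinite-trials objective can overestimate the utility an agent actually attains in a single trial, which is the gap the finite-trials formulation addresses.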
Cite

Text

Mutti et al. "Convex Reinforcement Learning in Finite Trials." Journal of Machine Learning Research, 2023.

Markdown

[Mutti et al. "Convex Reinforcement Learning in Finite Trials." Journal of Machine Learning Research, 2023.](https://mlanthology.org/jmlr/2023/mutti2023jmlr-convex/)

BibTeX
@article{mutti2023jmlr-convex,
  title   = {{Convex Reinforcement Learning in Finite Trials}},
  author  = {Mutti, Mirco and De Santi, Riccardo and De Bartolomeis, Piersilvio and Restelli, Marcello},
  journal = {Journal of Machine Learning Research},
  year    = {2023},
  volume  = {24},
  pages   = {1--42},
  url     = {https://mlanthology.org/jmlr/2023/mutti2023jmlr-convex/}
}