Actor-Critics Can Achieve Optimal Sample Efficiency
Abstract
Actor-critic algorithms have become a cornerstone in reinforcement learning (RL), leveraging the strengths of both policy-based and value-based methods. Despite recent progress in understanding their statistical efficiency, no existing work learns an $\epsilon$-optimal policy with a sample complexity of $O(1/\epsilon^2)$ trajectories under general function approximation when strategic exploration is necessary. We address this open problem by introducing a novel actor-critic algorithm that attains a sample complexity of $O(dH^5 \log|\mathcal{A}|/\epsilon^2 + dH^4 \log|\mathcal{F}|/\epsilon^2)$ trajectories, with accompanying $\sqrt{T}$ regret when the Bellman eluder dimension $d$ grows no faster than $\log T$. Here, $\mathcal{F}$ is the critic function class, and $\mathcal{A}$ is the action space. Our algorithm integrates optimism, off-policy critic estimation targeting the optimal Q-function, and rare-switching policy resets. We extend this to the setting of Hybrid RL, where we show that initializing the critic with offline data yields sample efficiency gains, and we also provide a non-optimistic, provably efficient actor-critic algorithm, addressing another open problem in the literature. Numerical experiments support our theoretical findings.
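To make the scaling of the stated bound concrete, the following is a minimal sketch that evaluates its two terms with the constant hidden by $O(\cdot)$ set to 1. The function name, parameter names, and the numeric values are assumptions chosen for illustration and do not come from the paper; the output should be read as a relative scale, not as an actual trajectory count.

```python
import math


def trajectory_bound(d, H, num_actions, critic_class_size, eps):
    """Evaluate d*H^5*log|A|/eps^2 + d*H^4*log|F|/eps^2.

    Assumption: the constant hidden by O(.) is taken to be 1, so the
    result only indicates how the bound scales with its inputs.
    """
    if eps <= 0:
        raise ValueError("eps must be positive")
    # Term that depends on the action space A.
    actor_term = d * H**5 * math.log(num_actions) / eps**2
    # Term that depends on the critic function class F.
    critic_term = d * H**4 * math.log(critic_class_size) / eps**2
    return actor_term + critic_term


# Hypothetical problem sizes, chosen only to illustrate the scaling.
base = trajectory_bound(d=10, H=5, num_actions=4, critic_class_size=1000, eps=0.1)
halved = trajectory_bound(d=10, H=5, num_actions=4, critic_class_size=1000, eps=0.05)

# Halving eps should multiply the bound by about 4 (the 1/eps^2 rate).
print(halved / base)
```

Because both terms share the $1/\epsilon^2$ factor, the ratio depends only on the two accuracy levels and not on $d$, $H$, $|\mathcal{A}|$, or $|\mathcal{F}|$.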
Cite
Text
Tan et al. "Actor-Critics Can Achieve Optimal Sample Efficiency." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Tan et al. "Actor-Critics Can Achieve Optimal Sample Efficiency." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/tan2025icml-actorcritics/)
BibTeX
@inproceedings{tan2025icml-actorcritics,
title = {{Actor-Critics Can Achieve Optimal Sample Efficiency}},
author = {Tan, Kevin and Fan, Wei and Wei, Yuting},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {58545--58590},
volume = {267},
url = {https://mlanthology.org/icml/2025/tan2025icml-actorcritics/}
}