Boosting the Actor with Dual Critic

Abstract

This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic, or Dual-AC. It is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation, which can be viewed as a two-player game between the actor and a critic-like function called the dual critic. Compared to its actor-critic relatives, Dual-AC has the desirable property that the actor and dual critic are updated cooperatively to optimize the same objective function, providing a more transparent way of learning a critic that is directly related to the actor's objective function. We then provide a concrete algorithm that can effectively solve the minimax optimization problem, using multi-step bootstrapping, path regularization, and a stochastic dual ascent algorithm. We demonstrate that the proposed algorithm achieves state-of-the-art performance across several benchmarks.

Cite

Text

Dai et al. "Boosting the Actor with Dual Critic." International Conference on Learning Representations, 2018.

Markdown

[Dai et al. "Boosting the Actor with Dual Critic." International Conference on Learning Representations, 2018.](https://mlanthology.org/iclr/2018/dai2018iclr-boosting/)

BibTeX

@inproceedings{dai2018iclr-boosting,
  title     = {{Boosting the Actor with Dual Critic}},
  author    = {Dai, Bo and Shaw, Albert and He, Niao and Li, Lihong and Song, Le},
  booktitle = {International Conference on Learning Representations},
  year      = {2018},
  url       = {https://mlanthology.org/iclr/2018/dai2018iclr-boosting/}
}