Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)
Abstract
This paper discusses a new approach to the fundamental problem of learning optimal Q-functions. In this approach, optimal Q-functions are formulated as saddle points of a nonlinear Lagrangian function derived from the classic Bellman optimality equation. The paper shows that the Lagrangian enjoys strong duality despite its nonlinearity, which paves the way for a general Lagrangian method for Q-function learning. As a demonstration, the paper develops an imitation learning algorithm based on the duality theory and applies the algorithm to a state-of-the-art machine translation benchmark. The paper then demonstrates a symmetry-breaking phenomenon regarding the optimality of the Lagrangian saddle points, which justifies a largely overlooked direction in developing the Lagrangian method.
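For readers unfamiliar with the starting point named in the abstract, the classic Bellman optimality equation states that the optimal Q-function satisfies Q*(s, a) = r(s, a) + γ · E[max over a' of Q*(s', a')]. The sketch below is not the paper's Lagrangian method; it is a minimal illustration, on a hypothetical two-state MDP whose transition table `P`, rewards `R`, and discount `GAMMA` are all made-up assumptions, of what it means for a Q-table to satisfy that equation. It uses plain Q-value iteration to find the fixed point and then measures the remaining violation.

```python
# Toy 2-state, 2-action MDP. All numbers are illustrative assumptions,
# not taken from the paper.
GAMMA = 0.9
# P[s][a] is a list of (next_state, probability); R[s][a] is the reward.
P = {
    0: {0: [(0, 0.8), (1, 0.2)], 1: [(1, 1.0)]},
    1: {0: [(0, 1.0)], 1: [(1, 0.6), (0, 0.4)]},
}
R = {0: {0: 0.0, 1: 1.0}, 1: {0: 2.0, 1: 0.5}}


def bellman_backup(q):
    """Apply the Bellman optimality operator once to a Q-table."""
    return {
        s: {
            a: R[s][a]
            + GAMMA * sum(p * max(q[s2].values()) for s2, p in P[s][a])
            for a in P[s]
        }
        for s in P
    }


def max_abs_diff(q1, q2):
    """Largest entrywise gap between two Q-tables."""
    return max(abs(q1[s][a] - q2[s][a]) for s in q1 for a in q1[s])


def bellman_residual(q):
    """Largest violation of Q = T(Q) over all state-action pairs."""
    return max_abs_diff(bellman_backup(q), q)


def q_value_iteration(tol=1e-10, max_iters=10_000):
    """Iterate the backup from an all-zero table until it stops changing."""
    q = {s: {a: 0.0 for a in P[s]} for s in P}
    for _ in range(max_iters):
        new_q = bellman_backup(q)
        if max_abs_diff(new_q, q) < tol:
            return new_q
        q = new_q
    return q


q_star = q_value_iteration()
residual = bellman_residual(q_star)
print("Q*:", q_star)
print("Bellman residual:", residual)
```

In this sketch the Bellman equation is enforced by repeated substitution. The abstract describes a different route, in which the same optimality condition is recast as a saddle-point problem of a Lagrangian.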
Cite
Text
Bojun. "Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)." International Conference on Machine Learning, 2022.
Markdown
[Bojun. "Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)." International Conference on Machine Learning, 2022.](https://mlanthology.org/icml/2022/bojun2022icml-lagrangian/)
BibTeX
@inproceedings{bojun2022icml-lagrangian,
title = {{Lagrangian Method for Q-Function Learning (with Applications to Machine Translation)}},
author = {Bojun, Huang},
booktitle = {International Conference on Machine Learning},
year = {2022},
pages = {2129-2159},
volume = {162},
url = {https://mlanthology.org/icml/2022/bojun2022icml-lagrangian/}
}