Learning Energy Networks with Generalized Fenchel-Young Losses
Abstract
Energy-based models, a.k.a. energy networks, perform inference by optimizing an energy function, typically parametrized by a neural network. This allows one to capture potentially complex relationships between inputs and outputs. To learn the parameters of the energy function, the solution to that optimization problem is typically fed into a loss function. The key challenge for training energy networks lies in computing loss gradients, as this typically requires argmin/argmax differentiation. In this paper, building upon a generalized notion of conjugate function, which replaces the usual bilinear pairing with a general energy function, we propose generalized Fenchel-Young losses, a natural loss construction for learning energy networks. Our losses enjoy many desirable properties and their gradients can be computed efficiently without argmin/argmax differentiation. We also prove the calibration of their excess risk in the case of linear-concave energies. We demonstrate our losses on multilabel classification and imitation learning tasks.
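To make the gradient claim concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes the loss takes the form L(v, p) = max_q [Phi(v, q) - Omega(q)] + Omega(p) - Phi(v, p), with a hypothetical energy Phi(v, q) = <tanh(W v), q> that is linear in q, Omega the negative Shannon entropy, and q restricted to the probability simplex. The names `energy`, `neg_entropy`, `gfy_loss_and_grad` and the matrix `W` are illustrative assumptions. Under these assumptions the inner maximizer is a softmax, and the gradient with respect to v follows from the envelope theorem by holding that maximizer fixed, so the argmax itself is never differentiated.

```python
import numpy as np


def energy(v, q, W):
    # Hypothetical energy Phi(v, q) = <tanh(W v), q>: nonlinear in v, linear in q.
    return float(np.tanh(W @ v) @ q)


def neg_entropy(q):
    # Omega(q) = sum_i q_i log q_i, with the convention 0 log 0 = 0.
    q = np.asarray(q, dtype=float)
    nz = q > 0
    return float(np.sum(q[nz] * np.log(q[nz])))


def softmax(u):
    # Numerically stable softmax.
    e = np.exp(u - np.max(u))
    return e / e.sum()


def gfy_loss_and_grad(v, p, W):
    """Loss value and gradient w.r.t. v for the assumed energy above."""
    u = np.tanh(W @ v)
    # Inner maximizer over the simplex of <u, q> - Omega(q) is softmax(u).
    p_star = softmax(u)
    # Generalized conjugate evaluated at its maximizer.
    conj = energy(v, p_star, W) - neg_entropy(p_star)
    loss = conj + neg_entropy(p) - energy(v, p, W)
    # Envelope theorem: treat p_star as a constant, so
    # grad_v L = grad_v Phi(v, p_star) - grad_v Phi(v, p),
    # where grad_v Phi(v, q) = W^T ((1 - u^2) * q) for this energy.
    grad = W.T @ ((1.0 - u ** 2) * (p_star - p))
    return loss, grad


# Usage: one-hot target over 3 classes, 4-dimensional input v.
rng = np.random.default_rng(0)
W = rng.normal(size=(3, 4))
v = rng.normal(size=4)
p = np.array([0.0, 1.0, 0.0])
loss, grad = gfy_loss_and_grad(v, p, W)
```

In this sketch the gradient costs one evaluation of the inner maximizer plus two energy gradients. A finite-difference comparison against `loss` is an easy way to check the envelope-theorem expression for other choices of energy.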
Cite
Text
Blondel et al. "Learning Energy Networks with Generalized Fenchel-Young Losses." Neural Information Processing Systems, 2022.
Markdown
[Blondel et al. "Learning Energy Networks with Generalized Fenchel-Young Losses." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/blondel2022neurips-learning/)
BibTeX
@inproceedings{blondel2022neurips-learning,
title = {{Learning Energy Networks with Generalized Fenchel-Young Losses}},
author = {Blondel, Mathieu and Llinares-Lopez, Felipe and Dadashi, Robert and Hussenot, Leonard and Geist, Matthieu},
booktitle = {Neural Information Processing Systems},
year = {2022},
url = {https://mlanthology.org/neurips/2022/blondel2022neurips-learning/}
}