Learning Energy Networks with Generalized Fenchel-Young Losses

Abstract

Energy-based models, a.k.a. energy networks, perform inference by optimizing an energy function, typically parametrized by a neural network. This allows one to capture potentially complex relationships between inputs and outputs. To learn the parameters of the energy function, the solution to that optimization problem is typically fed into a loss function. The key challenge for training energy networks lies in computing loss gradients, as this typically requires argmin/argmax differentiation. In this paper, building upon a generalized notion of conjugate function, which replaces the usual bilinear pairing with a general energy function, we propose generalized Fenchel-Young losses, a natural loss construction for learning energy networks. Our losses enjoy many desirable properties and their gradients can be computed efficiently without argmin/argmax differentiation. We also prove the calibration of their excess risk in the case of linear-concave energies. We demonstrate our losses on multilabel classification and imitation learning tasks.
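To make the construction concrete, here is a minimal JAX sketch (not the paper's implementation). Replacing the bilinear pairing with an energy Phi(theta, y) turns the conjugate into max_y Phi(theta, y) - Omega(y), and the loss becomes the gap between that maximum and the score of the true output. The toy energy, regularizer, and gradient-ascent inner solver below are illustrative stand-ins; applying stop_gradient to the inner solution realizes the envelope-theorem gradient, which is why no argmax differentiation is needed.

import jax
import jax.numpy as jnp

def omega(y):
    # Convex regularizer Omega(y); the squared 2-norm is one common choice.
    return 0.5 * jnp.dot(y, y)

def energy(theta, x, y):
    # Toy energy Phi_theta(x, y): bilinear in (theta, y) plus a fixed
    # concave-in-y term. In the paper this would be a neural network.
    return y @ (theta["W"] @ x) - 0.25 * jnp.sum(y ** 4)

def inner_argmax(theta, x, y0, lr=0.1, steps=200):
    # Approximate y* = argmax_y Phi(theta, x, y) - Omega(y) by gradient
    # ascent; any off-the-shelf solver could be substituted here.
    ascent = jax.grad(lambda y: energy(theta, x, y) - omega(y))
    y = y0
    for _ in range(steps):
        y = y + lr * ascent(y)
    return y

def gfy_loss(theta, x, y_true, y0):
    # Generalized Fenchel-Young loss:
    #   L(theta; y_true) = [Phi(theta, x, y*) - Omega(y*)]
    #                    - [Phi(theta, x, y_true) - Omega(y_true)],
    # where the first bracket is the generalized conjugate of Omega at theta.
    # stop_gradient on y* yields the envelope-theorem gradient, so the
    # argmax itself is never differentiated through.
    y_star = jax.lax.stop_gradient(inner_argmax(theta, x, y0))
    score = lambda y: energy(theta, x, y) - omega(y)
    return score(y_star) - score(y_true)

# Usage: one loss/gradient evaluation on a random toy problem.
key = jax.random.PRNGKey(0)
theta = {"W": jax.random.normal(key, (3, 3))}
x = jnp.ones(3)
y_true = jnp.array([1.0, 0.0, 1.0])
loss, grads = jax.value_and_grad(gfy_loss)(theta, x, y_true, jnp.zeros(3))

Note that, as with classical Fenchel-Young losses, the loss is nonnegative by construction (the inner maximum dominates the score of any particular y, up to inner-solver accuracy) and vanishes exactly when y_true attains that maximum.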

Cite

Text

Blondel et al. "Learning Energy Networks with Generalized Fenchel-Young Losses." Neural Information Processing Systems, 2022.

Markdown

[Blondel et al. "Learning Energy Networks with Generalized Fenchel-Young Losses." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/blondel2022neurips-learning/)

BibTeX

@inproceedings{blondel2022neurips-learning,
  title     = {{Learning Energy Networks with Generalized Fenchel-Young Losses}},
  author    = {Blondel, Mathieu and Llinares-Lopez, Felipe and Dadashi, Robert and Hussenot, Leonard and Geist, Matthieu},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/blondel2022neurips-learning/}
}