In-Context Learning of Energy Functions

Abstract

In-context learning is a powerful capability of certain machine learning models that arguably underpins the success of today's frontier AI models. However, in-context learning is critically limited to settings where the in-context distribution of interest $p_{\theta}^{ICL}(x|\mathcal{D})$ can be straightforwardly expressed and/or parameterized by the model; for instance, language modeling relies on expressing the next-token distribution as a categorical distribution parameterized by the network's output logits. In this work, we present a more general form of in-context learning, free of this limitation, that we call *in-context learning of energy functions*. The idea is instead to learn the unconstrained, arbitrary in-context energy function $E_{\theta}^{ICL}(x|\mathcal{D})$ corresponding to the in-context distribution $p_{\theta}^{ICL}(x|\mathcal{D})$. To do this, we draw on classic ideas from energy-based modeling. We provide preliminary evidence that our method works empirically on synthetic data. Interestingly, our work contributes (to the best of our knowledge) the first example of in-context learning where the input space and output space differ from one another, suggesting that in-context learning is a more general capability than previously realized.
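
To make the idea concrete, below is a minimal, hypothetical sketch of what in-context learning of an energy function could look like: a small permutation-invariant network maps a context set $\mathcal{D}$ and a query $x$ to a scalar energy $E_{\theta}^{ICL}(x|\mathcal{D})$, and is trained with a classic contrastive energy-based-modeling objective whose negatives come from short-run Langevin dynamics. This is not the authors' exact architecture or training recipe; all module names, hyperparameters, and the synthetic Gaussian tasks are illustrative assumptions.

import torch
import torch.nn as nn

class InContextEnergyNet(nn.Module):
    """Scalar energy E_theta(x | D) conditioned on a context set D (assumed architecture)."""
    def __init__(self, dim=1, hidden=64):
        super().__init__()
        self.context_enc = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(),
                                         nn.Linear(hidden, hidden))
        self.energy_head = nn.Sequential(nn.Linear(hidden + dim, hidden), nn.ReLU(),
                                         nn.Linear(hidden, 1))

    def forward(self, x, context):
        # context: (batch, n_context, dim); x: (batch, dim)
        ctx = self.context_enc(context).mean(dim=1)   # permutation-invariant summary of D
        return self.energy_head(torch.cat([ctx, x], dim=-1)).squeeze(-1)

def langevin_negatives(model, context, dim=1, steps=20, step_size=0.1, noise=0.01):
    """Short-run Langevin dynamics to draw approximate samples from p_theta(x | D)."""
    x = torch.randn(context.shape[0], dim)
    for _ in range(steps):
        x = x.detach().requires_grad_(True)
        grad = torch.autograd.grad(model(x, context).sum(), x)[0]
        x = x - step_size * grad + noise * torch.randn_like(x)
    return x.detach()

# Toy training loop on synthetic data: each task is a Gaussian with a random mean,
# the context D is a set of samples from it, and x_pos is a held-out sample.
model = InContextEnergyNet()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(1000):
    means = torch.randn(32, 1)                                   # one mean per in-context task
    context = means.unsqueeze(1) + 0.1 * torch.randn(32, 16, 1)  # context set D
    x_pos = means + 0.1 * torch.randn(32, 1)                     # held-out positive sample
    x_neg = langevin_negatives(model, context)                   # model samples as negatives
    # Contrastive-divergence-style loss: push energy down on data, up on model samples.
    loss = model(x_pos, context).mean() - model(x_neg, context).mean()
    opt.zero_grad(); loss.backward(); opt.step()

In this sketch the input (a context set plus a query point) and the output (a scalar energy) live in different spaces, which is the property the abstract highlights; whether the learned energy matches the true in-context distribution would have to be checked, e.g., by comparing Langevin samples against the task's ground-truth density.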

Cite

Text

Schaeffer et al. "In-Context Learning of Energy Functions." ICML 2024 Workshops: ICL, 2024.

Markdown

[Schaeffer et al. "In-Context Learning of Energy Functions." ICML 2024 Workshops: ICL, 2024.](https://mlanthology.org/icmlw/2024/schaeffer2024icmlw-incontext/)

BibTeX

@inproceedings{schaeffer2024icmlw-incontext,
  title     = {{In-Context Learning of Energy Functions}},
  author    = {Schaeffer, Rylan and Khona, Mikail and Koyejo, Sanmi},
  booktitle = {ICML 2024 Workshops: ICL},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/schaeffer2024icmlw-incontext/}
}