Compositional Generalization Through Gradient Search in Nonparametric Latent Space
Abstract
Many state-of-the-art methods in deep learning fail at systematic reasoning in settings that require compositional generalization. To address this, we propose a novel architecture that combines a nonparametric latent space, information-theoretic regularization of this space, and test-time gradient-based search to achieve strong performance on compositional meta-learning tasks such as program induction, Raven's progressive matrices, and linguistic systematicity tasks. Our proposed architecture, Abduction Transformer, uses nonparametric mixture distributions to represent inferred hidden causes of few-shot meta-learning instances. These representations are refined at test time via gradient descent to better account for the observed few-shot examples, a form of variational posterior inference that allows Abduction Transformer to solve meta-learning tasks requiring novel recombinations of knowledge acquired during training. Our method outperforms standard transformer architectures and a contemporary test-time adaptive variational approach, indicating a promising new direction for neural networks capable of systematic generalization.
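The general idea of refining a latent representation at test time by gradient descent on few-shot examples can be illustrated with a toy sketch. This is not the Abduction Transformer: the decoder here is a hypothetical frozen linear function, the latent is a plain two-number vector rather than a nonparametric mixture distribution, and no information-theoretic regularizer is included. The names `decode`, `loss_and_grad`, and `refine_latent` are assumptions made for illustration only.

```python
def decode(x, z):
    """Hypothetical frozen decoder: predicts y from x given latent z = (slope, offset)."""
    return z[0] * x + z[1]


def loss_and_grad(z, examples):
    """Mean squared error on the few-shot examples and its gradient w.r.t. z."""
    n = len(examples)
    loss, g0, g1 = 0.0, 0.0, 0.0
    for x, y in examples:
        err = decode(x, z) - y
        loss += err * err / n
        g0 += 2.0 * err * x / n  # d(loss)/d(z[0])
        g1 += 2.0 * err / n      # d(loss)/d(z[1])
    return loss, (g0, g1)


def refine_latent(z_init, examples, lr=0.1, steps=500):
    """Test-time search: only the latent z is updated; the decoder stays fixed."""
    z = list(z_init)
    for _ in range(steps):
        _, (g0, g1) = loss_and_grad(z, examples)
        z[0] -= lr * g0
        z[1] -= lr * g1
    return tuple(z)


# Few-shot support set generated by the hidden rule y = 2x + 1 (toy data).
support = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0)]

z0 = (0.0, 0.0)  # stand-in for an encoder's initial guess at the hidden cause
z_star = refine_latent(z0, support)

initial_loss, _ = loss_and_grad(z0, support)
final_loss, _ = loss_and_grad(z_star, support)
prediction = decode(3.0, z_star)  # query input not in the support set
```

In this sketch the refined latent explains the support examples better than the initial guess, and the query prediction follows the rule that generated them. In a real model the gradient would come from automatic differentiation through a learned decoder rather than from a hand-derived formula.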
Cite
Text
Shirakami and Henderson. "Compositional Generalization Through Gradient Search in Nonparametric Latent Space." International Conference on Learning Representations, 2026.
Markdown
[Shirakami and Henderson. "Compositional Generalization Through Gradient Search in Nonparametric Latent Space." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/shirakami2026iclr-compositional/)
BibTeX
@inproceedings{shirakami2026iclr-compositional,
title = {{Compositional Generalization Through Gradient Search in Nonparametric Latent Space}},
author = {Shirakami, Haruki and Henderson, James},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/shirakami2026iclr-compositional/}
}