Hidden Learning Dynamics of Capability Before Behavior in Diffusion Models

Abstract

Understanding how multimodal models generalize out of distribution is a fundamental challenge in machine learning. Compositional generalization explains this by assuming the model learns individual concepts and how to compose them. In this work, we train diffusion models on a compositional task using synthetic data of objects with different sizes and colors. We introduce a concept space as a framework for understanding the learning dynamics of compositional generalization. Within this framework, we identify $\textit{concept signal}$ as a driver of compositional generalization. Next, we find that diffusion models can acquire the $\textit{capability}$ to compositionally generalize long before they elicit this $\textit{behavior}$. Additionally, we find that the time at which this capability is learned can be pinpointed from the concept space learning dynamics. Finally, we suggest $\textit{embedding disentanglement}$ as another metric for probing the capability of a model. Overall, we take a step toward understanding the emergence of compositional capabilities in diffusion models.
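
The compositional setup described in the abstract can be illustrated with a minimal sketch, assuming a two-axis concept space (color and size) with one held-out combination; the axis values, held-out pair, and rendering details below are illustrative assumptions, not the authors' code or data.

# Minimal sketch of a synthetic compositional dataset: images varying
# along two concept axes (color x size), with one combination held out
# so that generating it requires compositional generalization.
# All specific values here are hypothetical.
from PIL import Image, ImageDraw

COLORS = {"red": (220, 40, 40), "blue": (40, 40, 220)}   # concept axis 1
SIZES = {"small": 8, "large": 20}                         # concept axis 2 (radius, px)
HELD_OUT = ("blue", "large")  # hypothetical held-out combination

def make_image(color: str, size: str, canvas: int = 32) -> Image.Image:
    """Render a single centered circle with the given concept values."""
    img = Image.new("RGB", (canvas, canvas), (0, 0, 0))
    draw = ImageDraw.Draw(img)
    r, c = SIZES[size], canvas // 2
    draw.ellipse((c - r, c - r, c + r, c + r), fill=COLORS[color])
    return img

# Training set: every concept combination except the held-out one. A model
# compositionally generalizes if it can still produce the held-out pair.
train = [(make_image(c, s), (c, s))
         for c in COLORS for s in SIZES if (c, s) != HELD_OUT]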

Cite

Text

Park et al. "Hidden Learning Dynamics of Capability Before Behavior in Diffusion Models." ICML 2024 Workshops: HiLD, 2024.

Markdown

[Park et al. "Hidden Learning Dynamics of Capability Before Behavior in Diffusion Models." ICML 2024 Workshops: HiLD, 2024.](https://mlanthology.org/icmlw/2024/park2024icmlw-hidden/)

BibTeX

@inproceedings{park2024icmlw-hidden,
  title     = {{Hidden Learning Dynamics of Capability Before Behavior in Diffusion Models}},
  author    = {Park, Core Francisco and Okawa, Maya and Lee, Andrew and Lubana, Ekdeep Singh and Tanaka, Hidenori},
  booktitle = {ICML 2024 Workshops: HiLD},
  year      = {2024},
  url       = {https://mlanthology.org/icmlw/2024/park2024icmlw-hidden/}
}