Scaling Graphically Structured Diffusion Models
Abstract
Applications of the recently introduced graphically structured diffusion model (GSDM) family show that sparsifying the transformer attention mechanism within a diffusion model and meta-training on a variety of conditioning tasks can yield an efficiently learnable diffusion model artifact capable of flexible amortized conditioning in probabilistic graphical models, in the sense that different subsets of variables can be observed at test time. While extremely promising in terms of applicability and utility, implementations of GSDMs prior to this work were not scalable beyond toy graphical model sizes. We overcome this limitation by describing and solving two scaling issues related to GSDMs: one engineering and one methodological. We additionally propose a new benchmark problem of weight inference for a convolutional neural network applied to $14\times14$ MNIST.
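The core mechanism the abstract refers to, sparsifying transformer attention according to the structure of a probabilistic graphical model, can be illustrated with a minimal sketch. The code below is an assumption-laden illustration, not the authors' implementation: the function name `structured_attention`, the shapes, and the toy adjacency matrix are all hypothetical, and a real scalable variant would use a sparse attention kernel rather than masking a dense score matrix.

```python
# Illustrative sketch (not the GSDM codebase): attention restricted to the
# edges of a graphical model, so each variable only attends to its neighbours.
import torch
import torch.nn.functional as F

def structured_attention(q, k, v, adjacency):
    """Scaled dot-product attention masked by a graphical-model adjacency.

    q, k, v:    (num_nodes, d) query/key/value vectors, one row per variable.
    adjacency:  (num_nodes, num_nodes) boolean mask; True where variable i is
                allowed to attend to variable j (neighbours plus self-loops).
    """
    d = q.shape[-1]
    scores = q @ k.transpose(-1, -2) / d ** 0.5              # dense (N, N) scores
    scores = scores.masked_fill(~adjacency, float("-inf"))   # remove non-edges
    return F.softmax(scores, dim=-1) @ v

# Toy usage: a chain-structured model x1 - x2 - x3 with self-loops.
adj = torch.tensor([[1, 1, 0],
                    [1, 1, 1],
                    [0, 1, 1]], dtype=torch.bool)
q = k = v = torch.randn(3, 8)
out = structured_attention(q, k, v, adj)
print(out.shape)  # torch.Size([3, 8])
```

Masking a dense score matrix, as above, conveys the idea but still costs $O(N^2)$ memory; the paper's engineering scaling fix concerns making such graph-sparse attention efficient beyond toy graphical model sizes.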
Cite
Text
Weilbach et al. "Scaling Graphically Structured Diffusion Models." ICML 2023 Workshops: SPIGM, 2023.
Markdown
[Weilbach et al. "Scaling Graphically Structured Diffusion Models." ICML 2023 Workshops: SPIGM, 2023.](https://mlanthology.org/icmlw/2023/weilbach2023icmlw-scaling/)
BibTeX
@inproceedings{weilbach2023icmlw-scaling,
title = {{Scaling Graphically Structured Diffusion Models}},
author = {Weilbach, Christian Dietrich and Harvey, William and Shirzad, Hamed and Wood, Frank},
booktitle = {ICML 2023 Workshops: SPIGM},
year = {2023},
url = {https://mlanthology.org/icmlw/2023/weilbach2023icmlw-scaling/}
}