DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion

Abstract

Real-world data generation often involves complex inter-dependencies among instances, violating the i.i.d. assumption of standard learning paradigms and posing a challenge for uncovering the geometric structures needed to learn desired instance representations. To this end, we introduce an energy-constrained diffusion model which encodes a batch of instances from a dataset into evolutionary states that progressively incorporate other instances' information through their interactions. The diffusion process is constrained by descent criteria w.r.t. a principled energy function that characterizes the global consistency of instance representations over latent structures. We provide rigorous theory that implies closed-form optimal estimates for the pairwise diffusion strength among arbitrary instance pairs, which gives rise to a new class of neural encoders, dubbed DIFFormer (diffusion-based Transformers), with two instantiations: a simple version with linear complexity for prohibitively large instance numbers, and an advanced version for learning complex structures. Experiments highlight the wide applicability of our model as a general-purpose encoder backbone with superior performance in various tasks, such as node classification on large graphs, semi-supervised image/text classification, and spatial-temporal dynamics prediction. The code is available at https://github.com/qitianwu/DIFFormer.
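The abstract mentions a simple instantiation whose all-pair diffusion can be computed with linear complexity in the number of instances. Below is a minimal sketch of how such a factorized propagation step could look; the cosine-style attention form 1 + cos(z_i, z_j), the mixing weight alpha, and the function name difformer_simple_layer are illustrative assumptions for exposition, not the paper's exact formulation.

```python
import torch

def difformer_simple_layer(z: torch.Tensor, alpha: float = 0.5) -> torch.Tensor:
    """One diffusion step with simple all-pair attention in O(N d^2) time.

    Hypothetical sketch: the attention weight between instances i and j is
    taken as 1 + cos(z_i, z_j), so the aggregation factorizes and the
    N x N attention matrix is never materialized.
    """
    n, d = z.shape
    z_norm = z / z.norm(dim=1, keepdim=True).clamp_min(1e-6)  # row-normalized states

    # Numerator: sum_j (1 + cos(z_i, z_j)) * z_j
    #          = sum_j z_j + z_norm_i^T (z_norm^T z)   (factorized form)
    sum_z = z.sum(dim=0, keepdim=True)        # (1, d)
    cross = z_norm @ (z_norm.t() @ z)         # (n, d), costs O(n d^2)
    numerator = sum_z + cross

    # Normalizer: sum_j (1 + cos(z_i, z_j)) = n + z_norm_i^T sum_j z_norm_j
    degree = n + z_norm @ z_norm.sum(dim=0)   # (n,)
    propagated = numerator / degree.unsqueeze(1)

    # Residual mixing between the current state and the propagated one
    return (1 - alpha) * z + alpha * propagated


# Usage: diffuse a batch of 1000 instance states of dimension 64
z = torch.randn(1000, 64)
z_next = difformer_simple_layer(z, alpha=0.5)
```

Because the aggregation is expressed through two dense matrix products of shapes (d, n)(n, d) and (n, d)(d, d), the cost scales linearly with the number of instances, which is the property the simple DIFFormer variant exploits.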

Cite

Text

Wu et al. "DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion." International Conference on Learning Representations, 2023.

Markdown

[Wu et al. "DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion." International Conference on Learning Representations, 2023.](https://mlanthology.org/iclr/2023/wu2023iclr-difformer/)

BibTeX

@inproceedings{wu2023iclr-difformer,
  title     = {{DIFFormer: Scalable (Graph) Transformers Induced by Energy Constrained Diffusion}},
  author    = {Wu, Qitian and Yang, Chenxiao and Zhao, Wentao and He, Yixuan and Wipf, David and Yan, Junchi},
  booktitle = {International Conference on Learning Representations},
  year      = {2023},
  url       = {https://mlanthology.org/iclr/2023/wu2023iclr-difformer/}
}