Multi-Architecture Multi-Expert Diffusion Models
Abstract
In this paper, we address the performance degradation of efficient diffusion models by introducing Multi-architecturE Multi-Expert diffusion models (MEME). We identify the need for tailored operations at different time-steps in diffusion processes and leverage this insight to create compact yet high-performing models. MEME assigns distinct architectures to different time-step intervals, balancing convolution and self-attention operations based on observed frequency characteristics. We also introduce a soft interval assignment strategy for comprehensive training. Empirically, MEME operates 3.3 times faster than baselines while improving image generation quality (FID scores) by 0.62 (FFHQ) and 0.37 (CelebA). Although we validate the effectiveness of assigning a better-suited architecture to each time-step in a setting where efficient models outperform larger ones, we argue that MEME opens a new design choice for diffusion models that can be easily applied in other scenarios, such as large multi-expert models.
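The interval-based routing described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not specify how intervals are chosen or what form the soft interval assignment takes, so the equal-width intervals, the function names (`expert_for_timestep`, `interval_bounds`, `sample_training_timestep`), and the `p_inside` mixing probability are all assumptions made for illustration. It also assumes `num_experts <= num_steps`.

```python
import random


def expert_for_timestep(t: int, num_steps: int, num_experts: int) -> int:
    """Hard routing: map time-step t to the expert that owns its interval.

    Assumes equal-width intervals over [0, num_steps); the paper may
    partition time-steps differently.
    """
    if not 0 <= t < num_steps:
        raise ValueError("t must lie in [0, num_steps)")
    return t * num_experts // num_steps


def interval_bounds(expert: int, num_steps: int, num_experts: int) -> tuple[int, int]:
    """Half-open range [lo, hi) of time-steps routed to `expert`."""
    lo = -(-(expert * num_steps) // num_experts)        # ceil(expert * T / N)
    hi = -(-((expert + 1) * num_steps) // num_experts)  # ceil((expert + 1) * T / N)
    return lo, hi


def sample_training_timestep(expert: int, num_steps: int, num_experts: int,
                             p_inside: float = 0.9, rng=random) -> int:
    """Soft assignment (assumed form, not taken from the paper).

    With probability p_inside, draw a time-step from the expert's own
    interval; otherwise draw uniformly from the whole range, so each
    expert still sees some time-steps outside its interval in training.
    """
    lo, hi = interval_bounds(expert, num_steps, num_experts)
    if rng.random() < p_inside:
        return rng.randrange(lo, hi)
    return rng.randrange(num_steps)


if __name__ == "__main__":
    # Hypothetical setup: 1000 diffusion steps shared by 4 experts.
    for t in (0, 249, 250, 999):
        print(t, "->", expert_for_timestep(t, 1000, 4))
```

At sampling time only the hard routing would be used, so each denoising step runs a single expert; the soft sampler matters only during training, where it controls how much out-of-interval data each expert sees.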
Cite
Text
Lee et al. "Multi-Architecture Multi-Expert Diffusion Models." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I12.29245
Markdown
[Lee et al. "Multi-Architecture Multi-Expert Diffusion Models." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/lee2024aaai-multi/) doi:10.1609/AAAI.V38I12.29245
BibTeX
@inproceedings{lee2024aaai-multi,
title = {{Multi-Architecture Multi-Expert Diffusion Models}},
author = {Lee, Yunsung and Kim, JinYoung and Go, Hyojun and Jeong, Myeongho and Oh, Shinhyeok and Choi, Seungtaek},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2024},
pages = {13427-13436},
doi = {10.1609/AAAI.V38I12.29245},
url = {https://mlanthology.org/aaai/2024/lee2024aaai-multi/}
}