FMBoost: Boosting Latent Diffusion with Flow Matching

Fischer, Johannes S; Gui, Ming; Ma, Pingchuan; Stracke, Nick; Baumann, Stefan Andreas; Hu, Vincent Tao; Ommer, Björn

doi:10.1007/978-3-031-73030-6_19

ECCV 2024

Abstract

Visual synthesis has recently seen significant leaps in performance, largely due to breakthroughs in generative models. Diffusion models have been a key enabler, as they excel in image diversity. However, this comes at the cost of slow training and synthesis, which is only partially alleviated by latent diffusion. To this end, flow matching is an appealing approach due to its complementary characteristics of faster training and inference but less diverse synthesis. We demonstrate our FMBoost approach, which introduces flow matching between a frozen diffusion model and a convolutional decoder that enables high-resolution image synthesis at reduced computational cost and model size. A small diffusion model can then effectively provide the necessary visual diversity, while flow matching efficiently enhances resolution and detail by mapping the small to a high-dimensional latent space, producing high-resolution images. Combining the diversity of diffusion models, the efficiency of flow matching, and the effectiveness of convolutional decoders, state-of-the-art high-resolution image synthesis is achieved at $1024^2$ pixels with minimal computational cost. Cascading FMBoost optionally boosts this further to $2048^2$ pixels. Importantly, this approach is orthogonal to recent approximation and speed-up strategies for the underlying model, making it easily integrable into the various diffusion model frameworks.
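
The mapping from a small latent to a higher-dimensional latent described above can be illustrated with a toy flow matching sketch. This is not the authors' implementation: the nearest-neighbour upsampling, the straight-line interpolation path, the Euler integrator, the tensor shapes, and all function names (`upsample_nearest`, `fm_training_pair`, `euler_integrate`) are assumptions chosen for illustration, and an oracle velocity stands in for a trained network.

```python
import numpy as np


def upsample_nearest(z, factor):
    """Nearest-neighbour upsampling of a (C, H, W) latent (illustrative choice)."""
    return z.repeat(factor, axis=1).repeat(factor, axis=2)


def fm_training_pair(x0, x1, t):
    """Point on the straight path from x0 to x1 at time t, plus its velocity.

    With a linear path x_t = (1 - t) * x0 + t * x1, the regression target
    for a velocity network is the constant x1 - x0 (assumed path, not
    necessarily the one used in the paper).
    """
    xt = (1.0 - t) * x0 + t * x1
    v_target = x1 - x0
    return xt, v_target


def euler_integrate(velocity_fn, x0, num_steps):
    """Integrate dx/dt = velocity_fn(x, t) from t = 0 to t = 1 with Euler steps."""
    x = x0.copy()
    dt = 1.0 / num_steps
    for i in range(num_steps):
        x = x + dt * velocity_fn(x, i * dt)
    return x


rng = np.random.default_rng(0)

# Stand-in for a latent sampled from a small, frozen diffusion model.
z_low = rng.standard_normal((4, 8, 8))
# Start of the flow: the small latent brought to the target resolution.
x0 = upsample_nearest(z_low, 2)
# Stand-in for the high-resolution latent the flow should reach.
x1 = rng.standard_normal((4, 16, 16))

# One training example: a network v(x_t, t) would regress onto v_target.
xt, v_target = fm_training_pair(x0, x1, t=0.3)

# Inference with an oracle velocity in place of a trained network.
x_end = euler_integrate(lambda x, t: x1 - x0, x0, num_steps=10)
```

In a full pipeline, a learned velocity network would replace the oracle lambda, and a convolutional decoder would map `x_end` to pixels; both are omitted here.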

Cite

Text

Fischer et al. "FMBoost: Boosting Latent Diffusion with Flow Matching." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73030-6_19

Markdown

[Fischer et al. "FMBoost: Boosting Latent Diffusion with Flow Matching." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/fischer2024eccv-fmboost/) doi:10.1007/978-3-031-73030-6_19

BibTeX

@inproceedings{fischer2024eccv-fmboost,
  title     = {{FMBoost: Boosting Latent Diffusion with Flow Matching}},
  author    = {Fischer, Johannes S and Gui, Ming and Ma, Pingchuan and Stracke, Nick and Baumann, Stefan Andreas and Hu, Vincent Tao and Ommer, Björn},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73030-6_19},
  url       = {https://mlanthology.org/eccv/2024/fischer2024eccv-fmboost/}
}