Exploring DCN-like Architecture for Fast Image Generation with Arbitrary Resolution

Abstract

Arbitrary-resolution image generation remains a challenging task in AIGC, as it requires handling varying resolutions and aspect ratios while maintaining high visual quality. Existing transformer-based diffusion methods suffer from quadratic computation cost and limited resolution extrapolation capabilities, making them less effective for this task. In this paper, we propose FlowDCN, a purely convolution-based generative model with linear time and memory complexity that can efficiently generate high-quality images at arbitrary resolutions. Equipped with a new learnable group-wise deformable convolution block, FlowDCN offers greater flexibility and can handle different resolutions with a single model. FlowDCN achieves a state-of-the-art 4.30 sFID on the $256\times256$ ImageNet benchmark and comparable resolution extrapolation results, surpassing transformer-based counterparts in convergence speed (requiring only $\frac{1}{5}$ of the images), visual quality, parameter count ($8\%$ reduction), and FLOPs ($20\%$ reduction). We believe FlowDCN offers a promising solution for scalable and flexible image synthesis.
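
The contrast between quadratic and linear cost drawn in the abstract can be illustrated with a rough counting sketch. This is not the FlowDCN implementation: the downsampling stride of 16 and the 9 sampled points per position are hypothetical values chosen only for illustration, and the functions count interactions rather than real FLOPs.

```python
# Rough interaction counts, not a FLOP model of any real network.
# ASSUMPTIONS (not from the paper): stride=16 and k=9 are illustrative values.

def num_positions(height, width, stride=16):
    """Spatial positions left after downsampling by `stride` in each dimension."""
    return (height // stride) * (width // stride)

def global_attention_cost(n):
    """Global self-attention relates every position to every other: n * n."""
    return n * n

def local_sampling_cost(n, k=9):
    """A (deformable) convolution reads k sampled points per position: n * k."""
    return n * k

for side in (256, 512, 1024):
    n = num_positions(side, side)
    print(f"{side}x{side}: positions={n}, "
          f"attention={global_attention_cost(n)}, "
          f"local={local_sampling_cost(n)}")
```

Under these assumptions, doubling the image side quadruples the number of positions, so the attention count grows 16-fold while the local sampling count grows only 4-fold.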

Cite

Text

Wang et al. "Exploring DCN-like Architecture for Fast Image Generation with Arbitrary Resolution." Neural Information Processing Systems, 2024. doi:10.52202/079017-2792

Markdown

[Wang et al. "Exploring DCN-like Architecture for Fast Image Generation with Arbitrary Resolution." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/wang2024neurips-exploring-a/) doi:10.52202/079017-2792

BibTeX

@inproceedings{wang2024neurips-exploring-a,
  title     = {{Exploring DCN-like Architecture for Fast Image Generation with Arbitrary Resolution}},
  author    = {Wang, Shuai and Li, Zexian and Song, Tianhui and Li, Xubin and Ge, Tiezheng and Zheng, Bo and Wang, Limin},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-2792},
  url       = {https://mlanthology.org/neurips/2024/wang2024neurips-exploring-a/}
}