Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers

Zhong, Yunshan; Zhou, Yuyao; Zhang, Yuxin; Sui, Wanchen; Li, Shen; Li, Yong; Chao, Fei; Ji, Rongrong

ICCV 2025 pp. 12479-12490

Abstract

Data-free quantization (DFQ) enables model quantization without accessing real data, addressing concerns regarding data security and privacy. With the growing adoption of Vision Transformers (ViTs), DFQ for ViTs has garnered significant attention. However, existing DFQ methods exhibit two limitations: (1) semantic distortion, where the semantics of synthetic images deviate substantially from those of real images, and (2) semantic inadequacy, where synthetic images contain extensive regions with limited content and oversimplified textures, leading to suboptimal quantization performance. To address these limitations, we propose SARDFQ, a novel Semantics Alignment and Reinforcement Data-Free Quantization method for ViTs. To address semantic distortion, SARDFQ incorporates Attention Priors Alignment (APA), which optimizes synthetic images to follow randomly generated structure attention priors. To mitigate semantic inadequacy, SARDFQ introduces Multi-Semantic Reinforcement (MSR), leveraging localized patch optimization to enhance semantic richness across synthetic images. Furthermore, SARDFQ employs Soft-Label Learning (SL), wherein multiple semantic targets are adapted to facilitate the learning of multi-semantic images augmented by MSR. Extensive experiments demonstrate the effectiveness of SARDFQ, significantly surpassing existing methods. For example, SARDFQ improves top-1 accuracy on ImageNet by 15.52% for W4A4 ViT-B.
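For readers unfamiliar with the notation, "W4A4" conventionally denotes 4-bit weights and 4-bit activations. The following is a minimal sketch of generic uniform quantization at that bit-width, not the SARDFQ method itself. The function name `fake_quantize` and the min/max range calibration are assumptions made purely for illustration; the abstract does not specify the quantizer used.

```python
def fake_quantize(values, num_bits=4):
    """Quantize-dequantize a list of floats onto a uniform grid.

    Illustrative only: uses simple min/max calibration (an assumption),
    which maps the value range onto 2**num_bits evenly spaced levels.
    """
    levels = 2 ** num_bits - 1           # 15 steps between 16 levels for 4 bits
    lo, hi = min(values), max(values)
    if hi == lo:                         # constant input: nothing to quantize
        return list(values)
    scale = (hi - lo) / levels           # width of one quantization step
    out = []
    for v in values:
        q = round((v - lo) / scale)      # nearest integer code
        q = max(0, min(levels, q))       # clamp to the representable range
        out.append(lo + q * scale)       # map the code back to a float
    return out


# Hypothetical weights, used only to exercise the function.
weights = [-1.0, -0.37, 0.0, 0.42, 1.0]
quantized = fake_quantize(weights, num_bits=4)
```

With 4 bits, every output lies on one of at most 16 grid points, and each value moves by at most half a quantization step. This coarse grid is why low-bit settings such as W4A4 are sensitive to the quality of the calibration data, synthetic or real.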

Cite

Text

Zhong et al. "Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers." International Conference on Computer Vision, 2025.

Markdown

[Zhong et al. "Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/zhong2025iccv-semantic/)

BibTeX

@inproceedings{zhong2025iccv-semantic,
  title     = {{Semantic Alignment and Reinforcement for Data-Free Quantization of Vision Transformers}},
  author    = {Zhong, Yunshan and Zhou, Yuyao and Zhang, Yuxin and Sui, Wanchen and Li, Shen and Li, Yong and Chao, Fei and Ji, Rongrong},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {12479--12490},
  url       = {https://mlanthology.org/iccv/2025/zhong2025iccv-semantic/}
}