LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recognition

Abstract

Vision Transformers (ViTs) have revolutionized large-scale visual modeling, yet remain underexplored in face recognition (FR), where CNNs still dominate. We identify a critical bottleneck: CNN-inspired training paradigms fail to unlock ViT's potential, leading to suboptimal performance and convergence instability. To address this challenge, we propose LVFace, a ViT-based FR model that integrates Progressive Cluster Optimization (PCO) to achieve superior results. Specifically, PCO proceeds in three sequential stages: it first applies negative class sub-sampling (NCS) for robust and fast feature alignment from random initialization, then applies feature expectation penalties for centroid stabilization, and finally performs cluster boundary refinement through full-batch training without NCS constraints. LVFace establishes a new state-of-the-art face recognition baseline, surpassing leading approaches such as UniFace and TopoFR across multiple benchmarks. Extensive experiments demonstrate that LVFace delivers consistent performance gains while exhibiting scalability to large-scale datasets and compatibility with mainstream VLMs and LLMs. Notably, LVFace secured 1st place in the ICCV 2021 Masked Face Recognition (MFR)-Ongoing Challenge (March 2025), demonstrating its efficacy in real-world scenarios. The project is available at https://github.com/bytedance/LVFace.
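The abstract does not spell out how negative class sub-sampling is carried out, so the following is only a minimal sketch of the general idea under stated assumptions: every class present in the batch is kept as a positive, and a random fraction of the remaining classes is drawn as negatives for the classification loss. The function name `sample_classes` and the parameter `sample_rate` are hypothetical and are not taken from the LVFace code base.

```python
import random


def sample_classes(labels, num_classes, sample_rate, rng=None):
    """Illustrative negative class sub-sampling (assumed form, not the paper's code).

    Keeps every class that appears in `labels` (the positives) and fills the
    rest of the budget with randomly chosen negative classes. Returns the kept
    class ids and the batch labels remapped to indices into that kept list.
    """
    if not 0.0 < sample_rate <= 1.0:
        raise ValueError("sample_rate must be in (0, 1]")
    rng = rng or random.Random()

    positives = sorted(set(labels))
    positive_set = set(positives)

    # Budget of classes to keep; never smaller than the number of positives.
    num_keep = max(int(num_classes * sample_rate), len(positives))

    # Draw the remaining budget from classes absent from this batch.
    negatives = [c for c in range(num_classes) if c not in positive_set]
    sampled_negatives = rng.sample(negatives, num_keep - len(positives))

    kept = sorted(positives + sampled_negatives)

    # Remap original labels to positions inside the reduced class list,
    # so a loss can be computed over the kept class centers only.
    index_of = {c: i for i, c in enumerate(kept)}
    remapped = [index_of[y] for y in labels]
    return kept, remapped


if __name__ == "__main__":
    kept, remapped = sample_classes([3, 7, 3], num_classes=10, sample_rate=0.5)
    print("kept classes:", kept)
    print("remapped labels:", remapped)
```

In this sketch, setting `sample_rate=1.0` keeps every class, which corresponds to training without the sub-sampling constraint, as in the final stage described above. The feature expectation penalty of the intermediate stage is not shown because the abstract gives no formula for it.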

Cite

Text

You et al. "LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recognition." International Conference on Computer Vision, 2025.

Markdown

[You et al. "LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recognition." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/you2025iccv-lvface/)

BibTeX

@inproceedings{you2025iccv-lvface,
  title     = {{LVFace: Progressive Cluster Optimization for Large Vision Models in Face Recognition}},
  author    = {You, Jinghan and Li, Shanglin and Sun, Yuanrui and Wei, Jiangchuan and Guo, Mingyu and Feng, Chao and Ran, Jiao},
  booktitle = {International Conference on Computer Vision},
  year      = {2025},
  pages     = {11840-11849},
  url       = {https://mlanthology.org/iccv/2025/you2025iccv-lvface/}
}