Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration
Abstract
Video face restoration faces a critical challenge in maintaining temporal consistency while recovering fine facial details from degraded inputs. This paper presents a novel approach that extends Vector-Quantized Variational Autoencoders (VQ-VAEs), pretrained on static high-quality portraits, into a video restoration framework through variational latent space modeling. Our key innovation lies in reformulating discrete codebook representations as Dirichlet-distributed continuous variables, enabling probabilistic transitions between facial features across frames. A spatio-temporal Transformer architecture jointly models inter-frame dependencies and predicts latent distributions, while a Laplacian-constrained reconstruction loss combined with perceptual (LPIPS) regularization enhances both pixel accuracy and visual quality. Comprehensive evaluations on blind face restoration, video inpainting, and facial colorization tasks demonstrate state-of-the-art performance. This work establishes an effective paradigm for adapting intensive image priors, pretrained on high-quality images, to video restoration while addressing the critical challenge of flicker artifacts. The source code has been open-sourced and is available at https://github.com/fudan-generative-vision/DicFace.
Cite
Text
Chen et al. "Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration." International Conference on Computer Vision, 2025.Markdown
[Chen et al. "Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration." International Conference on Computer Vision, 2025.](https://mlanthology.org/iccv/2025/chen2025iccv-dirichletconstrained/)BibTeX
@inproceedings{chen2025iccv-dirichletconstrained,
title = {{Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration}},
author = {Chen, Baoyou and Liu, Ce and Yuan, Weihao and Dong, Zilong and Zhu, Siyu},
booktitle = {International Conference on Computer Vision},
year = {2025},
pages = {14507-14516},
url = {https://mlanthology.org/iccv/2025/chen2025iccv-dirichletconstrained/}
}