Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer

Xu, Kepeng; Xu, Li; He, Gang; Yu, Wenxin; Li, Yunsong

doi:10.24963/ijcai.2024/165

IJCAI 2024 pp. 1489-1497

Abstract

Traditional unsupervised domain adaptation (UDA) struggles to extract rich semantics due to backbone limitations. Recent large-scale pre-trained visual-language models (VLMs) have shown strong zero-shot learning capabilities in UDA tasks. However, directly using VLMs results in a mixture of semantic and domain-specific information, complicating knowledge transfer. Complex scenes with subtle semantic differences are prone to misclassification, which in turn can result in the loss of features that are crucial for distinguishing between classes. To address these challenges, we propose a novel counterfactual knowledge maintenance UDA framework. Specifically, we employ counterfactual disentanglement to separate the representation of semantic information from domain features, thereby reducing domain bias. Furthermore, to clarify ambiguous visual information specific to classes, we maintain the discriminative knowledge of both visual and textual information. This approach synergistically leverages multimodal information to preserve modality-specific distinguishable features. We conducted extensive experimental evaluations on several public datasets to demonstrate the effectiveness of our method. The source code is available at https://github.com/LiYaolab/CMKUDA

Cite

Text

Xu et al. "Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer." International Joint Conference on Artificial Intelligence, 2024. doi:10.24963/ijcai.2024/165

Markdown

[Xu et al. "Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer." International Joint Conference on Artificial Intelligence, 2024.](https://mlanthology.org/ijcai/2024/xu2024ijcai-beyond/) doi:10.24963/ijcai.2024/165

BibTeX

@inproceedings{xu2024ijcai-beyond,
  title     = {{Beyond Alignment: Blind Video Face Restoration via Parsing-Guided Temporal-Coherent Transformer}},
  author    = {Xu, Kepeng and Xu, Li and He, Gang and Yu, Wenxin and Li, Yunsong},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {1489--1497},
  doi       = {10.24963/ijcai.2024/165},
  url       = {https://mlanthology.org/ijcai/2024/xu2024ijcai-beyond/}
}