Robust Stage-Wise LVLM Adaptation: Multi-Phase Prompt LoRA Fine-Tuning for Compound Expression Recognition

Lu, Xilong; Yu, Jun; Zhang, Yunxiang; Zhu, Lingsi; Zheng, Yang; Wang, Yongqi; Ling, Qiang

CVPRW 2025 pp. 5770-5777

Abstract

Compound Expression Recognition (CER) is crucial for understanding human emotions and improving human-computer interaction. However, CER is challenging because facial expressions are complex and subtle emotional cues are difficult to capture. To address these challenges, we present an approach based on Large Vision-Language Models (LVLMs). Our method combines a two-stage fine-tuning process with purpose-designed prompts. In the first stage, pre-trained LVLMs are fine-tuned on basic facial expressions to establish fundamental patterns. In the second stage, the model is further optimized on a compound-expression dataset to refine the interactions between compound expressions. Our approach attains advanced accuracy on the RAF-DB dataset and demonstrates robust zero-shot generalization on the C-EXPR-DB dataset. Notably, in the 8th ABAW Compound Expression Recognition Challenge, our method secured first place with an F1 score of 0.5723, highlighting its potential for real-world applications in emotion analysis and human-computer interaction.
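The two-stage schedule described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: a toy linear layer stands in for the LVLM, the prompts are omitted, and the class `LoRALinear`, the helpers `run_stage` and `make_data`, the toy `basic_data` and `compound_data` sets, and the rank, scaling factor, and learning rate are all assumptions chosen for illustration. It shows only the general LoRA idea of freezing a pre-trained weight and training a low-rank update, first on one dataset and then on a second.

```python
import random


def matvec(M, x):
    """Multiply matrix M (a list of rows) by vector x."""
    return [sum(m * v for m, v in zip(row, x)) for row in M]


class LoRALinear:
    """Frozen weight W plus a trainable low-rank update (alpha / r) * B @ A."""

    def __init__(self, W, r=2, alpha=2.0, seed=0):
        rng = random.Random(seed)
        d_out, d_in = len(W), len(W[0])
        self.W = [row[:] for row in W]  # frozen: never modified by sgd_step
        self.A = [[rng.gauss(0.0, 0.3) for _ in range(d_in)] for _ in range(r)]
        self.B = [[0.0] * r for _ in range(d_out)]  # zero init: adapter starts as a no-op
        self.scale = alpha / r

    def forward(self, x):
        delta = matvec(self.B, matvec(self.A, x))
        return [b + self.scale * d for b, d in zip(matvec(self.W, x), delta)]

    def sgd_step(self, x, target, lr):
        """One SGD step on 0.5 * ||forward(x) - target||^2, updating only A and B."""
        h = matvec(self.A, x)
        err = [y - t for y, t in zip(self.forward(x), target)]
        # Backpropagate through B before B itself is modified.
        back = [self.scale * sum(e * row[k] for e, row in zip(err, self.B))
                for k in range(len(h))]
        for i, e in enumerate(err):
            for k, hk in enumerate(h):
                self.B[i][k] -= lr * self.scale * e * hk
        for k, bk in enumerate(back):
            for j, xj in enumerate(x):
                self.A[k][j] -= lr * bk * xj


def mean_loss(layer, data):
    """Mean squared-error loss of the layer over a list of (input, target) pairs."""
    total = 0.0
    for x, t in data:
        total += 0.5 * sum((y - ti) ** 2 for y, ti in zip(layer.forward(x), t))
    return total / len(data)


def run_stage(layer, data, epochs=300, lr=0.1):
    """Train the adapter on one stage's data; return (loss_before, loss_after)."""
    before = mean_loss(layer, data)
    for _ in range(epochs):
        for x, t in data:
            layer.sgd_step(x, t, lr)
    return before, mean_loss(layer, data)


def make_data(W_target):
    """Toy dataset (assumption): one-hot inputs with targets produced by W_target."""
    d_in = len(W_target[0])
    xs = [[1.0 if j == i else 0.0 for j in range(d_in)] for i in range(d_in)]
    return [(x, matvec(W_target, x)) for x in xs]


# Hypothetical "pre-trained" weight and two toy datasets standing in for
# basic-expression data (stage 1) and compound-expression data (stage 2).
W0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
basic_data = make_data([[1.5, 0.0, 0.0], [0.0, 0.5, 0.0]])
compound_data = make_data([[1.5, 0.0, 0.8], [0.0, 0.5, 0.8]])

layer = LoRALinear(W0)
stage1 = run_stage(layer, basic_data)     # stage 1: adapt to the basic data
stage2 = run_stage(layer, compound_data)  # stage 2: continue on the compound data
print("stage 1 loss before/after:", stage1)
print("stage 2 loss before/after:", stage2)
```

In this sketch the second stage continues from the adapter learned in the first stage rather than starting from a fresh one. Whether the paper merges, reuses, or re-initializes adapters between stages is not stated in the abstract, so that choice is also an assumption.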

Cite

Text

Lu et al. "Robust Stage-Wise LVLM Adaptation: Multi-Phase Prompt LoRA Fine-Tuning for Compound Expression Recognition." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.

Markdown

[Lu et al. "Robust Stage-Wise LVLM Adaptation: Multi-Phase Prompt LoRA Fine-Tuning for Compound Expression Recognition." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.](https://mlanthology.org/cvprw/2025/lu2025cvprw-robust/)

BibTeX

@inproceedings{lu2025cvprw-robust,
  title     = {{Robust Stage-Wise LVLM Adaptation: Multi-Phase Prompt LoRA Fine-Tuning for Compound Expression Recognition}},
  author    = {Lu, Xilong and Yu, Jun and Zhang, Yunxiang and Zhu, Lingsi and Zheng, Yang and Wang, Yongqi and Ling, Qiang},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2025},
  pages     = {5770-5777},
  url       = {https://mlanthology.org/cvprw/2025/lu2025cvprw-robust/}
}