Towards Open-Ended Visual Quality Comparison
Abstract
Comparative settings (pairwise choice, listwise ranking) have been adopted by a wide range of subjective studies for image quality assessment (IQA), as they inherently standardize the evaluation criteria across different observers and offer more clear-cut responses. In this work, we extend the edge of emerging large multi-modality models (LMMs) to further advance visual quality comparison into open-ended settings, which 1) can respond to open-range questions on quality comparison; 2) can provide detailed reasonings beyond direct answers. To this end, we propose Co-Instruct. To train this first-of-its-kind open-source open-ended visual quality comparer, we collect the Co-Instruct-562K dataset from two sources: (a) LLM-merged single-image quality descriptions, (b) GPT-4V "teacher" responses on unlabeled data. Furthermore, to better evaluate this setting, we propose MICBench, the first benchmark on multi-image comparison for LMMs. We demonstrate that Co-Instruct not only achieves on average 30% higher accuracy than state-of-the-art open-source LMMs, but also outperforms GPT-4V (its teacher) on both existing related benchmarks and the proposed MICBench. Our code, model, and data are released at https://github.com/Q-Future/Co-Instruct.
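For readers who want to try the released checkpoint, the sketch below illustrates what an open-ended multi-image quality query could look like. The Hugging Face model ID `q-future/co-instruct`, the `<|image|>` placeholder, and the `chat(prompt, images, ...)` helper are assumptions based on the project repository rather than details stated in this abstract; consult the linked repository for the authoritative usage example.

```python
# Minimal usage sketch under assumed interfaces; see github.com/Q-Future/Co-Instruct
# for the official example. Requires: pip install torch transformers pillow
import torch
from PIL import Image
from transformers import AutoModelForCausalLM

# Assumed model ID; trust_remote_code loads the custom multi-image chat wrapper.
model = AutoModelForCausalLM.from_pretrained(
    "q-future/co-instruct",
    trust_remote_code=True,
    torch_dtype=torch.float16,
    device_map={"": "cuda:0"},
)

images = [Image.open("first.jpg"), Image.open("second.jpg")]

# Open-ended comparative question; "<|image|>" marks where each image is spliced
# into the prompt (placeholder token assumed from the project README).
prompt = (
    "USER: The first image: <|image|>\nThe second image: <|image|>\n"
    "Which image is sharper, and why? ASSISTANT:"
)

# `chat` is the helper exposed by the remote code (assumed signature).
answer = model.chat(prompt, images, max_new_tokens=200)
print(answer)
```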
Cite
Text
Wu et al. "Towards Open-Ended Visual Quality Comparison." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-72646-0_21
Markdown
[Wu et al. "Towards Open-Ended Visual Quality Comparison." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/wu2024eccv-openended/) doi:10.1007/978-3-031-72646-0_21
BibTeX
@inproceedings{wu2024eccv-openended,
title = {{Towards Open-Ended Visual Quality Comparison}},
author = {Wu, Haoning and Zhu, Hanwei and Zhang, Zicheng and Zhang, Erli and Chen, Chaofeng and Liao, Liang and Li, Chunyi and Wang, Annan and Sun, Wenxiu and Yan, Qiong and Liu, Xiaohong and Zhai, Guangtao and Wang, Shiqi and Lin, Weisi},
booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
year = {2024},
doi = {10.1007/978-3-031-72646-0_21},
url = {https://mlanthology.org/eccv/2024/wu2024eccv-openended/}
}