VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment

Abstract

Text-driven video editing has recently experienced rapid development. Despite this, evaluating edited videos remains a considerable challenge. Current metrics tend to fail to align with human perceptions, and effective quantitative metrics for video editing are still notably absent. To address this, we introduce VE-Bench, a benchmark suite tailored to the assessment of text-driven video editing. This suite includes VE-Bench DB, a video quality assessment (VQA) database for video editing. VE-Bench DB encompasses a diverse set of source videos featuring various motions and subjects, along with multiple distinct editing prompts, editing results from 8 different models, and the corresponding Mean Opinion Scores (MOS) from 24 human annotators. Based on VE-Bench DB, we further propose VE-Bench QA, a quantitative human-aligned measurement for the text-driven video editing task. In addition to the aesthetic, distortion, and other visual quality indicators that traditional VQA methods emphasize, VE-Bench QA focuses on the text-video alignment and the relevance modeling between source and edited videos. It introduces a new assessment network for video editing that attains superior performance in alignment with human preferences. To the best of our knowledge, VE-Bench introduces the first quality assessment dataset for video editing and proposes an effective subjective-aligned quantitative metric for this domain. All models, data, and code will be publicly available to the community.
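As a rough illustration of the two quantities the abstract relies on, the sketch below computes a Mean Opinion Score per edited video and measures how well a metric's predictions agree with those scores using Spearman rank correlation. The abstract does not specify how MOS values are aggregated or which agreement measure is reported; plain averaging and Spearman correlation are assumptions here (both are common choices in VQA work), and every function name and data value is hypothetical.

```python
from statistics import mean


def mean_opinion_scores(ratings):
    """Average the per-annotator ratings of each video (plain mean; an assumption)."""
    return {video_id: mean(scores) for video_id, scores in ratings.items()}


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of entries tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical toy data: three edited videos, three annotators each.
ratings = {"clip_a": [4, 5, 4], "clip_b": [2, 3, 2], "clip_c": [3, 3, 4]}
mos = mean_opinion_scores(ratings)

# Hypothetical predictions from some quality metric for the same videos.
predicted = {"clip_a": 0.91, "clip_b": 0.35, "clip_c": 0.60}

video_ids = sorted(ratings)
srcc = spearman([mos[v] for v in video_ids], [predicted[v] for v in video_ids])
```

In this toy case the metric ranks the three clips in the same order as the annotators, so the rank correlation reaches its maximum; a metric that is poorly aligned with human perception would score closer to zero or below.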

Cite

Text

Sun et al. "VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I7.32763

Markdown

[Sun et al. "VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/sun2025aaai-ve/) doi:10.1609/AAAI.V39I7.32763

BibTeX

@inproceedings{sun2025aaai-ve,
  title     = {{VE-Bench: Subjective-Aligned Benchmark Suite for Text-Driven Video Editing Quality Assessment}},
  author    = {Sun, Shangkun and Liang, Xiaoyu and Fan, Songlin and Gao, Wenxu and Gao, Wei},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {7105--7113},
  doi       = {10.1609/AAAI.V39I7.32763},
  url       = {https://mlanthology.org/aaai/2025/sun2025aaai-ve/}
}