PVUW 2025 Challenge Report: Advances in Pixel-Level Understanding of Complex Videos in the Wild

Abstract

This report provides a comprehensive overview of the 4th Pixel-level Video Understanding in the Wild (PVUW) Challenge, held in conjunction with CVPR 2025. It summarizes the challenge outcomes, participating methodologies, and future research directions. The challenge features two tracks: MOSE, which focuses on complex scene video object segmentation, and MeViS, which targets motion-guided, language-based video segmentation. Both tracks introduce new, more challenging datasets designed to better reflect real-world scenarios. Through detailed evaluation and analysis, the challenge offers valuable insights into the current state of the art and emerging trends in complex video segmentation. More information can be found on the workshop website: https://pvuw.github.io/.

Cite

Text

Ding et al. "PVUW 2025 Challenge Report: Advances in Pixel-Level Understanding of Complex Videos in the Wild." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.

Markdown

[Ding et al. "PVUW 2025 Challenge Report: Advances in Pixel-Level Understanding of Complex Videos in the Wild." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2025.](https://mlanthology.org/cvprw/2025/ding2025cvprw-pvuw/)

BibTeX

@inproceedings{ding2025cvprw-pvuw,
  title     = {{PVUW 2025 Challenge Report: Advances in Pixel-Level Understanding of Complex Videos in the Wild}},
  author    = {Ding, Henghui and Liu, Chang and Ravi, Nikhila and He, Shuting and Wei, Yunchao and Bai, Song and Torr, Philip},
  booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
  year      = {2025},
  pages     = {2669--2678},
  url       = {https://mlanthology.org/cvprw/2025/ding2025cvprw-pvuw/}
}