Patch-VQ: 'Patching Up' the Video Quality Problem

Abstract

No-reference (NR) perceptual video quality assessment (VQA) is a complex, unsolved, and important problem for social and streaming media applications. Efficient and accurate video quality predictors are needed to monitor and guide the processing of billions of shared, often imperfect, user-generated content (UGC) videos. Unfortunately, current NR models are limited in their prediction capabilities on real-world, "in-the-wild" UGC video data. To advance progress on this problem, we created the largest (by far) subjective video quality dataset, containing 38,811 real-world distorted videos, 116,433 space-time localized video patches ('v-patches'), and 5.5M human perceptual quality annotations. Using this dataset, we created two unique NR-VQA models: (a) a local-to-global region-based NR VQA architecture (called PVQ) that learns to predict global video quality and achieves state-of-the-art performance on 3 UGC datasets, and (b) a first-of-a-kind space-time video quality mapping engine (called PVQ Mapper) that helps localize and visualize perceptual distortions in space and time. The entire dataset and prediction models are freely available at https://live.ece.utexas.edu/research.php.
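
The local-to-global idea described above can be illustrated with a minimal sketch (not the authors' code): crop space-time 'v-patches' from a video, score each patch with a no-reference quality predictor, and pool the local scores into a global estimate plus a space-time quality map. All function names, patch sizes, and the placeholder scorer below are illustrative assumptions; PVQ itself learns both the patch scoring and the local-to-global mapping.

```python
# Minimal sketch of local-to-global video quality pooling (illustrative only).
import numpy as np

def extract_vpatches(video, patch_size=64, clip_len=16, stride=64):
    """Yield space-time patches (clip_len x patch_size x patch_size) with their locations."""
    T, H, W = video.shape[:3]
    for t in range(0, T - clip_len + 1, clip_len):
        for y in range(0, H - patch_size + 1, stride):
            for x in range(0, W - patch_size + 1, stride):
                yield (t, y, x), video[t:t + clip_len, y:y + patch_size, x:x + patch_size]

def patch_quality(vpatch):
    """Placeholder local quality score; a trained NR model would replace this."""
    # Local spatial variance is only a crude stand-in for a learned predictor.
    return float(np.clip(vpatch.std(), 0, 100))

def global_quality(video, **kwargs):
    """Pool local v-patch scores into one global score (simple mean pooling here)."""
    locations, scores = [], []
    for loc, vp in extract_vpatches(video, **kwargs):
        locations.append(loc)
        scores.append(patch_quality(vp))
    # Return the global estimate and the per-patch scores as a space-time quality map.
    return float(np.mean(scores)), list(zip(locations, scores))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    video = rng.integers(0, 256, size=(32, 256, 256), dtype=np.uint8).astype(np.float32)
    score, quality_map = global_quality(video)
    print(f"global quality estimate: {score:.2f} from {len(quality_map)} v-patches")
```

In this simplified form, mean pooling plays the role of the learned local-to-global mapping, and the per-patch scores stand in for the kind of space-time visualization that PVQ Mapper produces.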

Cite

Text

Ying et al. "Patch-VQ: 'Patching Up' the Video Quality Problem." Conference on Computer Vision and Pattern Recognition, 2021.

Markdown

[Ying et al. "Patch-VQ: 'Patching Up' the Video Quality Problem." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/ying2021cvpr-patchvq/)

BibTeX

@inproceedings{ying2021cvpr-patchvq,
  title     = {{Patch-VQ: 'Patching Up' the Video Quality Problem}},
  author    = {Ying, Zhenqiang and Mandal, Maniratnam and Ghadiyaram, Deepti and Bovik, Alan},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {14019--14029},
  url       = {https://mlanthology.org/cvpr/2021/ying2021cvpr-patchvq/}
}