Perception Loss Function Adaptive to Rate for Learned Video Compression
Abstract
We consider causal, low-latency, sequential video compression, with mean squared error (MSE) as the distortion loss and a perception loss function (PLF) to enhance the realism of outputs. Prior works have employed two PLFs: one based on the joint distribution (JD) of all frames up to the current one, and the other based on the frame-wise marginal distribution (FMD). We introduce a new PLF, called adaptive to rate (AR), which preserves the joint distribution of the current frame with all previous reconstructions. Through information-theoretic analysis and deep-learning experiments, we show that PLF-AR can rectify past errors in future reconstructions when the initial frame is compressed at a low bitrate. In the same low-bitrate scenario, however, PLF-JD exhibits the error permanence phenomenon, propagating mistakes into subsequent outputs. When the initial frame is compressed at a high bitrate, PLF-AR maintains temporal correlation among frames, preventing error propagation in future reconstructions, unlike PLF-JD, which remains stuck in past mistakes. Furthermore, PLF-FMD does not preserve temporal correlation as effectively as PLF-AR. These characteristics of the PLFs are especially apparent in scenarios with sharp frame movements. In contrast, when frame movements are smoother, the three PLFs differ only slightly: PLF-AR and PLF-JD yield more diverse outputs, while PLF-FMD tends to replicate the initial frame in all future reconstructions. We validate our findings through information-theoretic analysis of the rate-distortion-perception tradeoff for the Gauss-Markov source model and deep-learning experiments on the moving MNIST and UVG datasets.
Cite
Text
Salehkalaibar et al. "Perception Loss Function Adaptive to Rate for Learned Video Compression." NeurIPS 2024 Workshops: Compression, 2024.
Markdown
[Salehkalaibar et al. "Perception Loss Function Adaptive to Rate for Learned Video Compression." NeurIPS 2024 Workshops: Compression, 2024.](https://mlanthology.org/neuripsw/2024/salehkalaibar2024neuripsw-perception/)
BibTeX
@inproceedings{salehkalaibar2024neuripsw-perception,
title = {{Perception Loss Function Adaptive to Rate for Learned Video Compression}},
author = {Salehkalaibar, Sadaf and Phan, Buu and Dick, João Atz and Khisti, Ashish J and Chen, Jun and Yu, Wei},
booktitle = {NeurIPS 2024 Workshops: Compression},
year = {2024},
url = {https://mlanthology.org/neuripsw/2024/salehkalaibar2024neuripsw-perception/}
}