Joint Motion and Residual Information Latent Representation for P-Frame Coding
Abstract
This paper proposes an inter-frame prediction scheme for the P-frame video compression track of the Workshop and Challenge on Learned Image Compression (CLIC). In this challenge, an uncompressed reference (previous) frame is used to compress the current frame, so the method is not a complete solution for learning-based video compression. The main goal is to represent a set of frames at an average of 0.075 bpp (bits per pixel), which is a very low bitrate; a restriction on the model size is also imposed to avoid overfitting. We propose an autoencoder architecture that jointly represents the motion and residual information in the latent space. Three trained models are used to reach the target bpp, and a bit allocation algorithm is proposed to optimize the quality of the encoded dataset.
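The abstract mentions that a bit allocation algorithm distributes bits across the dataset so that three trained models together meet the 0.075 bpp average. The paper's actual procedure is not described here, so the sketch below is only an illustrative assumption: a greedy heuristic that starts from the best-quality model per video and downgrades the choices with the smallest quality loss per bit saved until the average rate target is met. All names (`Candidate`, `allocate_bits`, `TARGET_BPP`) are hypothetical, and the per-video average is used as a stand-in for the true dataset-level bpp.

```python
# Hypothetical bit-allocation sketch; not the authors' published algorithm.
from dataclasses import dataclass
from typing import Dict, List

TARGET_BPP = 0.075  # average bits per pixel requested by the challenge


@dataclass
class Candidate:
    model_id: int   # which of the three trained models
    bpp: float      # estimated bits per pixel for this video with this model
    quality: float  # estimated quality score (e.g. MS-SSIM) for this video


def allocate_bits(candidates: Dict[str, List[Candidate]]) -> Dict[str, Candidate]:
    """Pick one model per video so the average bpp meets TARGET_BPP.

    Greedy heuristic (assumption): start from the best-quality choice for
    every video, then repeatedly downgrade the video whose switch to a
    cheaper model costs the least quality per bit saved.
    """
    # Start with the highest-quality candidate for each video.
    choice = {v: max(c, key=lambda x: x.quality) for v, c in candidates.items()}

    def avg_bpp() -> float:
        # Simplification: per-video average, assuming similar frame sizes.
        return sum(c.bpp for c in choice.values()) / len(choice)

    while avg_bpp() > TARGET_BPP:
        best_video, best_cand, best_cost = None, None, float("inf")
        for video, cands in candidates.items():
            current = choice[video]
            for cand in cands:
                saved = current.bpp - cand.bpp
                if saved <= 0:
                    continue  # not cheaper than the current choice
                cost = (current.quality - cand.quality) / saved
                if cost < best_cost:
                    best_video, best_cand, best_cost = video, cand, cost
        if best_cand is None:
            break  # no cheaper option left; target may be unreachable
        choice[best_video] = best_cand
    return choice
```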
Cite
Text

da Silva et al. "Joint Motion and Residual Information Latent Representation for P-Frame Coding." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020. doi:10.1109/CVPRW50498.2020.00081

Markdown

[da Silva et al. "Joint Motion and Residual Information Latent Representation for P-Frame Coding." IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, 2020.](https://mlanthology.org/cvprw/2020/dasilva2020cvprw-joint/) doi:10.1109/CVPRW50498.2020.00081

BibTeX
@inproceedings{dasilva2020cvprw-joint,
title = {{Joint Motion and Residual Information Latent Representation for P-Frame Coding}},
author = {da Silva, Renam Castro and Júnior, Nilson Donizete Guerin and Sanches, Pedro and Jung, Henrique Costa and Peixoto, Eduardo and Macchiavello, Bruno and Hung, Edson M. and Testoni, Vanessa and Freitas, Pedro Garcia},
booktitle = {IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops},
year = {2020},
pages = {590-592},
doi = {10.1109/CVPRW50498.2020.00081},
url = {https://mlanthology.org/cvprw/2020/dasilva2020cvprw-joint/}
}