Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding

Abstract

Typical video compression systems consist of two main modules: motion coding and residual coding. This general architecture is adopted by classical coding schemes (such as the international standards H.265 and H.266) and by deep learning-based coding schemes. We propose a novel B-frame coding architecture based on two-layer Conditional Augmented Normalizing Flows (CANF). Its striking feature is that it transmits no motion information. Our proposed idea of video compression without motion coding offers a new direction for learned video coding. Our base layer is a low-resolution image compressor that replaces the full-resolution motion compressor. The low-resolution coded image is merged with the warped high-resolution images to generate a high-quality image that serves as the conditioning signal for the enhancement-layer image coding in full resolution. One advantage of this architecture is significantly reduced computational complexity, because the motion information compressor is eliminated. In addition, we adopt a skip-mode coding technique to reduce the number of transmitted latent samples. The rate-distortion performance of our scheme is slightly lower than that of the state-of-the-art learned B-frame coding scheme, B-CANF, but higher than that of other learned B-frame coding schemes. Compared to B-CANF, however, our scheme saves 45% of multiply-accumulate operations (MACs) for encoding and 27% of MACs for decoding. The code is available at https://nycu-clab.github.io.
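The data flow described in the abstract (code a low-resolution base layer, then merge it with warped full-resolution references to form a conditioning signal) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and not the authors' implementation: the function names, the average-pooling downsampler, the uniform quantizer standing in for the base-layer compressor, and the fixed-weight blend standing in for the merge step are all hypothetical. The warped references are taken as given inputs, since the abstract does not describe how they are produced.

```python
from typing import List

Frame = List[List[float]]  # one grayscale frame as rows of pixel values


def downsample(frame: Frame, factor: int = 2) -> Frame:
    """Average-pool a frame; height and width must be divisible by `factor`."""
    h, w = len(frame), len(frame[0])
    return [
        [
            sum(frame[y + dy][x + dx] for dy in range(factor) for dx in range(factor))
            / (factor * factor)
            for x in range(0, w, factor)
        ]
        for y in range(0, h, factor)
    ]


def upsample(frame: Frame, factor: int = 2) -> Frame:
    """Nearest-neighbour upsampling back to full resolution."""
    return [
        [value for value in row for _ in range(factor)]
        for row in frame
        for _ in range(factor)
    ]


def toy_base_codec(frame: Frame, step: float = 8.0) -> Frame:
    """Hypothetical stand-in for the base-layer compressor: uniform quantization."""
    return [[round(v / step) * step for v in row] for row in frame]


def build_condition(
    target: Frame, warped_prev: Frame, warped_next: Frame, factor: int = 2
) -> Frame:
    """Blend the decoded base layer with two warped references.

    The fixed weights are a placeholder (assumption) for whatever merge
    the real system uses; only the overall data flow follows the abstract.
    """
    base_up = upsample(toy_base_codec(downsample(target, factor)), factor)
    return [
        [0.5 * b + 0.25 * p + 0.25 * n for b, p, n in zip(rb, rp, rn)]
        for rb, rp, rn in zip(base_up, warped_prev, warped_next)
    ]


# Usage: a flat 4x4 frame with identical references.
flat = [[16.0] * 4 for _ in range(4)]
condition = build_condition(flat, flat, flat)
print(len(condition), len(condition[0]))  # 4 4
```

In this toy setting the only data that would need to be transmitted is the quantized low-resolution frame; the conditioning signal would then be passed to a full-resolution enhancement-layer coder, which is not modeled here.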

Cite

Text

Alexandre et al. "Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.00988

Markdown

[Alexandre et al. "Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/alexandre2023cvpr-hierarchical/) doi:10.1109/CVPR52729.2023.00988

BibTeX

@inproceedings{alexandre2023cvpr-hierarchical,
  title     = {{Hierarchical B-Frame Video Coding Using Two-Layer CANF Without Motion Coding}},
  author    = {Alexandre, David and Hang, Hsueh-Ming and Peng, Wen-Hsiao},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {10249--10258},
  doi       = {10.1109/CVPR52729.2023.00988},
  url       = {https://mlanthology.org/cvpr/2023/alexandre2023cvpr-hierarchical/}
}