Deep Learning in Latent Space for Video Prediction and Compression

Liu, Bowen; Chen, Yu; Liu, Shiyu; Kim, Hun-Seok

doi:10.1109/CVPR46437.2021.00076

CVPR 2021 pp. 701-710

Abstract

Learning-based video compression has achieved substantial progress in recent years. The most influential approaches adopt deep neural networks (DNNs) to remove spatial and temporal redundancies by finding appropriate lower-dimensional representations of the frames in a video. We propose a novel DNN-based framework that predicts and compresses video sequences in the latent vector space. The proposed method first learns an efficient lower-dimensional latent-space representation of each video frame and then performs inter-frame prediction in that latent domain. The latent-domain compression of individual frames is obtained by a deep autoencoder trained with a generative adversarial network (GAN). To exploit the temporal correlation within the video frame sequence, we employ a convolutional long short-term memory (ConvLSTM) network to predict the latent vector representation of the future frame. We demonstrate our method with two applications, video compression and abnormal event detection, which share the same latent frame prediction network. The proposed method exhibits superior or competitive performance compared to state-of-the-art algorithms specifically designed for either video compression or anomaly detection.
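The pipeline described in the abstract (encode each frame to a latent vector, predict the next latent from earlier ones, and reuse that prediction for both compression and anomaly detection) can be illustrated with a toy sketch. Everything below is an assumption made for illustration: `encode` is a trivial averaging stand-in for the GAN-trained autoencoder, `predict_next` is linear extrapolation rather than a ConvLSTM, and coding a quantized prediction residual and scoring anomalies by prediction error are plausible readings of the abstract, not details it states.

```python
# Toy sketch only: every function here is a hypothetical stand-in,
# not the paper's GAN-trained autoencoder or ConvLSTM predictor.
from typing import List, Sequence, Tuple

Latent = List[float]


def encode(frame: Sequence[float]) -> Latent:
    """Stand-in encoder: average adjacent pixel pairs, halving the dimension."""
    return [(frame[i] + frame[i + 1]) / 2.0 for i in range(0, len(frame) - 1, 2)]


def predict_next(history: List[Latent]) -> Latent:
    """Stand-in predictor: linear extrapolation from the last two latents."""
    if len(history) < 2:
        return list(history[-1])
    prev, last = history[-2], history[-1]
    return [2.0 * b - a for a, b in zip(prev, last)]


def quantize(residual: Latent, step: float) -> List[int]:
    """Uniform scalar quantization of the prediction residual."""
    return [round(r / step) for r in residual]


def dequantize(symbols: List[int], step: float) -> Latent:
    return [s * step for s in symbols]


def code_sequence(
    frames: Sequence[Sequence[float]], step: float = 0.5
) -> Tuple[List[List[int]], List[float]]:
    """Return quantized residual symbols and a prediction-error score per frame.

    The first frame's latent is assumed to be sent as-is; later frames are
    represented only by the quantized difference from the predicted latent.
    """
    history = [encode(frames[0])]  # latents as the decoder would reconstruct them
    all_symbols: List[List[int]] = []
    scores: List[float] = []
    for frame in frames[1:]:
        z = encode(frame)
        z_hat = predict_next(history)
        residual = [a - b for a, b in zip(z, z_hat)]
        symbols = quantize(residual, step)
        # Update the history with the decoder-side reconstruction so that
        # encoder and decoder predict from identical inputs.
        recon = [p + r for p, r in zip(z_hat, dequantize(symbols, step))]
        history.append(recon)
        all_symbols.append(symbols)
        # Mean squared prediction error, used here as a toy anomaly score.
        scores.append(sum(r * r for r in residual) / len(residual))
    return all_symbols, scores
```

In this sketch, a sequence whose latents evolve predictably yields residual symbols near zero, which would be cheap to entropy-code, while an abrupt change yields both large symbols and a high score. That is the sense in which a single prediction network could serve both applications.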

Cite

Text

Liu et al. "Deep Learning in Latent Space for Video Prediction and Compression." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.00076

Markdown

[Liu et al. "Deep Learning in Latent Space for Video Prediction and Compression." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/liu2021cvpr-deep-b/) doi:10.1109/CVPR46437.2021.00076

BibTeX

@inproceedings{liu2021cvpr-deep-b,
  title     = {{Deep Learning in Latent Space for Video Prediction and Compression}},
  author    = {Liu, Bowen and Chen, Yu and Liu, Shiyu and Kim, Hun-Seok},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {701--710},
  doi       = {10.1109/CVPR46437.2021.00076},
  url       = {https://mlanthology.org/cvpr/2021/liu2021cvpr-deep-b/}
}