Learned Video Compression via Joint Spatial-Temporal Correlation Exploration
Abstract
Traditional video compression technologies have been developed over decades in pursuit of higher coding efficiency. Efficient temporal information representation plays a key role in video coding. Thus, in this paper, we propose to exploit the temporal correlation using both first-order optical flow and second-order flow prediction. We suggest a one-stage learning approach that encapsulates flow as quantized features derived from consecutive frames, which are then entropy coded with adaptive contexts conditioned on joint spatial-temporal priors to exploit second-order correlations. The joint priors combine autoregressive spatial neighbors, co-located hyperprior elements, and temporal neighbors aggregated recurrently by a ConvLSTM. We evaluate our approach in the low-delay scenario against High Efficiency Video Coding (H.265/HEVC), H.264/AVC, and another learned video compression method, following the common test settings. Our work offers state-of-the-art performance, with consistent gains across all popular test sequences.
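To make the entropy-model design described above concrete, the following is a minimal PyTorch-style sketch, not the authors' implementation: it fuses an autoregressive spatial prior (a masked convolution over already-decoded neighbors), a co-located hyperprior, and a recurrent temporal prior (a ConvLSTM cell) into mean/scale parameters for a conditional entropy model. All class names (MaskedConv2d, ConvLSTMCell, JointPrior), channel sizes, and wiring are illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    # PixelCNN-style type-A masked conv: each position sees only
    # already-decoded spatial neighbors (autoregressive spatial prior).
    def __init__(self, *args, **kwargs):
        super().__init__(*args, **kwargs)
        _, _, h, w = self.weight.shape
        mask = torch.ones_like(self.weight)
        mask[:, :, h // 2, w // 2:] = 0  # mask center and everything right of it
        mask[:, :, h // 2 + 1:, :] = 0   # mask all rows below
        self.register_buffer("mask", mask)

    def forward(self, x):
        self.weight.data *= self.mask
        return super().forward(x)

class ConvLSTMCell(nn.Module):
    # Minimal ConvLSTM cell that aggregates temporal priors across frames.
    def __init__(self, in_ch, hid_ch, k=3):
        super().__init__()
        self.conv = nn.Conv2d(in_ch + hid_ch, 4 * hid_ch, k, padding=k // 2)

    def forward(self, x, state):
        h, c = state
        i, f, o, g = self.conv(torch.cat([x, h], dim=1)).chunk(4, dim=1)
        c = torch.sigmoid(f) * c + torch.sigmoid(i) * torch.tanh(g)
        h = torch.sigmoid(o) * torch.tanh(c)
        return h, (h, c)

class JointPrior(nn.Module):
    # Fuses autoregressive spatial, co-located hyper, and recurrent temporal
    # priors into (mean, scale) parameters for the entropy model.
    def __init__(self, ch=192):
        super().__init__()
        self.spatial = MaskedConv2d(ch, ch, 5, padding=2)
        self.temporal = ConvLSTMCell(ch, ch)
        self.fuse = nn.Sequential(
            nn.Conv2d(3 * ch, ch, 1), nn.ReLU(inplace=True),
            nn.Conv2d(ch, 2 * ch, 1))

    def forward(self, y_hat, hyper, state=None):
        if state is None:  # first frame: zero-initialize the recurrent state
            z = torch.zeros_like(y_hat)
            state = (z, z.clone())
        s = self.spatial(y_hat)                 # autoregressive spatial neighbors
        t, state = self.temporal(y_hat, state)  # temporal neighbors via ConvLSTM
        mean, scale = self.fuse(torch.cat([s, hyper, t], dim=1)).chunk(2, dim=1)
        return mean, scale, state

In use, one would call JointPrior once per frame in decoding order, feeding back the returned state so the ConvLSTM carries temporal context across the sequence.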
Cite
Text
Liu et al. "Learned Video Compression via Joint Spatial-Temporal Correlation Exploration." AAAI Conference on Artificial Intelligence, 2020. doi:10.1609/AAAI.V34I07.6825
Markdown
[Liu et al. "Learned Video Compression via Joint Spatial-Temporal Correlation Exploration." AAAI Conference on Artificial Intelligence, 2020.](https://mlanthology.org/aaai/2020/liu2020aaai-learned/) doi:10.1609/AAAI.V34I07.6825
BibTeX
@inproceedings{liu2020aaai-learned,
title = {{Learned Video Compression via Joint Spatial-Temporal Correlation Exploration}},
author = {Liu, Haojie and Shen, Han and Huang, Lichao and Lu, Ming and Chen, Tong and Ma, Zhan},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2020},
pages = {11580--11587},
doi = {10.1609/AAAI.V34I07.6825},
url = {https://mlanthology.org/aaai/2020/liu2020aaai-learned/}
}