Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time
Abstract
We present a video decomposition method that facilitates layer-based editing of videos with spatiotemporally varying lighting and motion effects. Our neural model decomposes an input video into multiple layered representations, each comprising a 2D texture map, a mask for the original video, and a multiplicative residual characterizing the spatiotemporal variations in lighting conditions. A single edit on the texture maps can be propagated to the corresponding locations across all video frames while preserving the consistency of the remaining content. Our method efficiently learns the layer-based neural representations of a 1080p video in 25s per frame via coordinate hashing and allows real-time rendering of the edited result at 71 fps on a single GPU. Qualitatively, we run our method on various videos to show its effectiveness in generating high-quality editing effects. Quantitatively, we propose adopting feature-tracking evaluation metrics to objectively assess the consistency of video editing. Project page: https://lightbulb12294.github.io/hashing-nvd/
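To make the decomposition concrete, below is a minimal sketch of the per-pixel compositing that such a layered model implies, assuming $L$ layers with masks $m_i$, multiplicative residuals $r_i$, texture atlases $A_i$, and learned space-time-to-atlas (UV) mappings $u_i$; the notation is ours for illustration, not the paper's.

% Hedged sketch of layered reconstruction with multiplicative residuals
% (our notation, not the authors'): each pixel p at time t is a mask-weighted
% blend of atlas colors, each modulated by a per-layer lighting residual.
\[
  \hat{I}(p, t) \;=\; \sum_{i=1}^{L} m_i(p, t)\, r_i(p, t)\, A_i\!\bigl(u_i(p, t)\bigr),
  \qquad \sum_{i=1}^{L} m_i(p, t) = 1 .
\]

Under this reading, the residual enters multiplicatively, so an edit applied to a texture atlas $A_i$ is re-modulated by the per-frame lighting when composited back, which is how an edit can propagate through the video while the spatiotemporal lighting variation is preserved.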
Cite
Text
Chan et al. "Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.00712
Markdown
[Chan et al. "Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/chan2023iccv-hashing/) doi:10.1109/ICCV51070.2023.00712
BibTeX
@inproceedings{chan2023iccv-hashing,
title = {{Hashing Neural Video Decomposition with Multiplicative Residuals in Space-Time}},
author = {Chan, Cheng-Hung and Yuan, Cheng-Yang and Sun, Cheng and Chen, Hwann-Tzong},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {7743--7753},
doi = {10.1109/ICCV51070.2023.00712},
url = {https://mlanthology.org/iccv/2023/chan2023iccv-hashing/}
}