Lightweight Video Denoising Using Aggregated Shifted Window Attention
Abstract
Video denoising is a fundamental problem in numerous computer vision applications. State-of-the-art attention-based denoising methods typically yield good results, but they require vast amounts of GPU memory and usually suffer from very long computation times. In particular, these techniques are not applicable in practice to the restoration of digitized high-resolution historic films. To overcome these issues, we introduce a lightweight video denoising network that combines efficient axial-coronal-sagittal (ACS) convolutions with a novel shifted window attention formulation (ASwin), which is based on the memory-efficient aggregation of self- and cross-attention across video frames. We numerically validate the performance and efficiency of our approach on synthetic Gaussian noise. Moreover, we train our network as a general-purpose blind denoising model for real-world videos, using a realistic noise synthesis pipeline to generate clean-noisy video pairs. A user study and non-reference quality assessment show that our method outperforms the state of the art on real-world historic videos in terms of denoising performance and temporal consistency.
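As a rough illustration of the synthetic Gaussian noise setting mentioned in the abstract, the following minimal sketch generates a clean-noisy video pair and scores it with PSNR. It is not the authors' pipeline: the function names `add_gaussian_noise` and `psnr`, the tensor layout (frames, height, width, channels), the [0, 1] intensity range, and the convention of specifying `sigma` on a 0-255 scale are all assumptions made for this example.

```python
import numpy as np


def add_gaussian_noise(clean, sigma, seed=0):
    """Add zero-mean Gaussian noise to a video.

    clean: float array in [0, 1], assumed shape (T, H, W, C).
    sigma: noise standard deviation on a 0-255 scale (assumed convention).
    """
    rng = np.random.default_rng(seed)
    noise = rng.normal(0.0, sigma / 255.0, size=clean.shape)
    # No clipping here, so the noise stays exactly Gaussian.
    return clean + noise


def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two arrays of equal shape."""
    mse = np.mean((reference - estimate) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))


# Hypothetical stand-in for a clean clip: 8 random frames of 64x64 RGB.
clean = np.random.default_rng(1).random((8, 64, 64, 3))
noisy = add_gaussian_noise(clean, sigma=30)
noisy_psnr = psnr(clean, noisy)
```

A denoiser would then be evaluated by comparing `psnr(clean, denoised)` against `noisy_psnr`; the real-world blind denoising setting described above relies on a more realistic noise model than this additive one.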
Cite
Text
Lindner et al. "Lightweight Video Denoising Using Aggregated Shifted Window Attention." Winter Conference on Applications of Computer Vision, 2023.
Markdown
[Lindner et al. "Lightweight Video Denoising Using Aggregated Shifted Window Attention." Winter Conference on Applications of Computer Vision, 2023.](https://mlanthology.org/wacv/2023/lindner2023wacv-lightweight/)
BibTeX
@inproceedings{lindner2023wacv-lightweight,
  title = {{Lightweight Video Denoising Using Aggregated Shifted Window Attention}},
  author = {Lindner, Lydia and Effland, Alexander and Ilic, Filip and Pock, Thomas and Kobler, Erich},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year = {2023},
  pages = {351-360},
  url = {https://mlanthology.org/wacv/2023/lindner2023wacv-lightweight/}
}