Spatially-Adaptive Pixelwise Networks for Fast Image Translation

Tamar Rott Shaham, Michael Gharbi, Richard Zhang, Eli Shechtman, Tomer Michaeli

CVPR 2021 pp. 14882-14891

doi:10.1109/CVPR46437.2021.01464

Abstract

We introduce a new generator architecture, aimed at fast and efficient high-resolution image-to-image translation. We design the generator to be an extremely lightweight function of the full-resolution image. In fact, we use pixel-wise networks; that is, each pixel is processed independently of others, through a composition of simple affine transformations and nonlinearities. We take three important steps to equip such a seemingly simple function with adequate expressivity. First, the parameters of the pixel-wise networks are spatially varying so they can represent a broader function class than simple 1x1 convolutions. Second, these parameters are predicted by a fast convolutional network that processes an aggressively low-resolution representation of the input. Third, we augment the input image with a sinusoidal encoding of spatial coordinates, which provides an effective inductive bias for generating realistic novel high-frequency image content. As a result, our model is up to 18x faster than state-of-the-art baselines. We achieve this speedup while generating comparable visual quality across different image resolutions and translation domains.
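The three ingredients named in the abstract (spatially varying per-pixel parameters, parameters predicted at low resolution, and a sinusoidal encoding of pixel coordinates) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names, layer sizes, frequency schedule, ReLU nonlinearity, and nearest-neighbor upsampling are all assumptions made for illustration, and random numbers stand in for the convolutional network that would predict the parameters from a downsampled input.

```python
import numpy as np


def sinusoidal_encoding(height, width, num_freqs=4):
    """Return an (H, W, 4 * num_freqs) array of sin/cos features of pixel coordinates."""
    ys = np.linspace(0.0, 1.0, height)[:, None]  # (H, 1)
    xs = np.linspace(0.0, 1.0, width)[None, :]   # (1, W)
    feats = []
    for k in range(num_freqs):
        freq = (2.0 ** k) * np.pi  # assumed octave-spaced frequencies
        for fn in (np.sin, np.cos):
            feats.append(np.broadcast_to(fn(freq * xs), (height, width)))
            feats.append(np.broadcast_to(fn(freq * ys), (height, width)))
    return np.stack(feats, axis=-1)


def num_params(layer_sizes):
    """Number of weights plus biases of one per-pixel MLP."""
    return sum(i * o + o for i, o in zip(layer_sizes[:-1], layer_sizes[1:]))


def upsample_nearest(params, factor):
    """Nearest-neighbor upsample an (h, w, P) grid to (h * factor, w * factor, P)."""
    return params.repeat(factor, axis=0).repeat(factor, axis=1)


def unpack_params(flat, layer_sizes):
    """Split an (H, W, P) parameter map into per-layer weights and biases."""
    h, w, _ = flat.shape
    weights, biases, offset = [], [], 0
    for c_in, c_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        n_w = c_in * c_out
        weights.append(flat[..., offset:offset + n_w].reshape(h, w, c_out, c_in))
        offset += n_w
        biases.append(flat[..., offset:offset + c_out])
        offset += c_out
    return weights, biases


def pixelwise_mlp(features, weights, biases):
    """Run a separate small MLP at every pixel; no pixel sees its neighbors."""
    x = features
    for i, (w, b) in enumerate(zip(weights, biases)):
        # Per-pixel affine map: each (h, w) location has its own matrix and bias.
        x = np.einsum("hwoc,hwc->hwo", w, x) + b
        if i < len(weights) - 1:
            x = np.maximum(x, 0.0)  # ReLU between layers (assumed nonlinearity)
    return x


rng = np.random.default_rng(0)
H = W = 16
factor = 4  # assumed ratio between full resolution and the parameter grid

image = rng.random((H, W, 3))
enc = sinusoidal_encoding(H, W, num_freqs=4)
features = np.concatenate([image, enc], axis=-1)  # image plus coordinate encoding

layer_sizes = [features.shape[-1], 8, 3]
# Stand-in for the low-resolution predictor network: random parameters per grid cell.
low_res_params = rng.normal(
    scale=0.1, size=(H // factor, W // factor, num_params(layer_sizes))
)
full_params = upsample_nearest(low_res_params, factor)
weights, biases = unpack_params(full_params, layer_sizes)

output = pixelwise_mlp(features, weights, biases)
print(output.shape)
```

In this sketch the only full-resolution work is the per-pixel MLP, while the parameter grid is 4x smaller along each axis. Because neighboring pixels inside one grid cell share the same weights, the coordinate encoding is what lets their outputs differ beyond what the input color alone would allow.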

Cite

Text

Shaham et al. "Spatially-Adaptive Pixelwise Networks for Fast Image Translation." Conference on Computer Vision and Pattern Recognition, 2021. doi:10.1109/CVPR46437.2021.01464

Markdown

[Shaham et al. "Spatially-Adaptive Pixelwise Networks for Fast Image Translation." Conference on Computer Vision and Pattern Recognition, 2021.](https://mlanthology.org/cvpr/2021/shaham2021cvpr-spatiallyadaptive/) doi:10.1109/CVPR46437.2021.01464

BibTeX

@inproceedings{shaham2021cvpr-spatiallyadaptive,
  title     = {{Spatially-Adaptive Pixelwise Networks for Fast Image Translation}},
  author    = {Shaham, Tamar Rott and Gharbi, Michael and Zhang, Richard and Shechtman, Eli and Michaeli, Tomer},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2021},
  pages     = {14882-14891},
  doi       = {10.1109/CVPR46437.2021.01464},
  url       = {https://mlanthology.org/cvpr/2021/shaham2021cvpr-spatiallyadaptive/}
}