MAT: Mask-Aware Transformer for Large Hole Image Inpainting

Abstract

Recent studies have shown the importance of modeling long-range interactions in the inpainting problem. To this end, existing approaches exploit either standalone attention techniques or transformers, but usually at low resolution due to computational cost. In this paper, we present a novel transformer-based model for large-hole inpainting, which unifies the merits of transformers and convolutions to efficiently process high-resolution images. We carefully design each component of our framework to guarantee the high fidelity and diversity of recovered images. Specifically, we customize an inpainting-oriented transformer block, where the attention module aggregates non-local information only from partially valid tokens, indicated by a dynamic mask. Extensive experiments demonstrate the state-of-the-art performance of the new model on multiple benchmark datasets. Code is released at https://github.com/fenglinglwb/MAT.
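The mask-aware attention described in the abstract, where invalid (hole) tokens are excluded from aggregation, can be sketched roughly as follows. This is a minimal, dependency-free illustration of the general idea, not the authors' implementation; the function name and interface are hypothetical, and it assumes at least one valid token:

```python
import math

def mask_aware_attention(query, keys, values, valid):
    """Single-query scaled dot-product attention over valid tokens only.

    query:  list of floats (length d)
    keys:   list of d-dim key vectors, one per token
    values: list of d-dim value vectors, one per token
    valid:  list of 0/1 flags; tokens flagged 0 (e.g. hole regions)
            receive zero attention weight (assumes >= 1 valid token)
    """
    d = len(query)
    scores = []
    for k, v in zip(keys, valid):
        s = sum(q * ki for q, ki in zip(query, k)) / math.sqrt(d)
        # Masking: send invalid tokens to -inf so softmax zeroes them out
        scores.append(s if v else float("-inf"))
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]  # exp(-inf) -> 0.0 for masked tokens
    total = sum(exps)
    weights = [e / total for e in exps]
    # Weighted sum of values, restricted (via weights) to valid tokens
    out = [sum(w * val[j] for w, val in zip(weights, values)) for j in range(d)]
    return out, weights
```

In MAT the validity mask is dynamic, i.e., updated as the network progressively fills the hole; the sketch above only shows the static masking step for one query.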

Cite

Text

Li et al. "MAT: Mask-Aware Transformer for Large Hole Image Inpainting." Conference on Computer Vision and Pattern Recognition, 2022. doi:10.1109/CVPR52688.2022.01049

Markdown

[Li et al. "MAT: Mask-Aware Transformer for Large Hole Image Inpainting." Conference on Computer Vision and Pattern Recognition, 2022.](https://mlanthology.org/cvpr/2022/li2022cvpr-mat/) doi:10.1109/CVPR52688.2022.01049

BibTeX

@inproceedings{li2022cvpr-mat,
  title     = {{MAT: Mask-Aware Transformer for Large Hole Image Inpainting}},
  author    = {Li, Wenbo and Lin, Zhe and Zhou, Kun and Qi, Lu and Wang, Yi and Jia, Jiaya},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2022},
  pages     = {10758--10768},
  doi       = {10.1109/CVPR52688.2022.01049},
  url       = {https://mlanthology.org/cvpr/2022/li2022cvpr-mat/}
}