Continuously Masked Transformer for Image Inpainting
Abstract
In this paper, a novel continuous-mask-aware transformer for image inpainting, called CMT, is proposed, which uses a continuous mask to represent the amount of error in each token. First, we initialize a mask and use it during self-attention. To facilitate the masked self-attention, we also introduce the notion of overlapping tokens. Second, we update the mask by modeling the error propagation during the masked self-attention. Through several masked self-attention and mask update (MSAU) layers, we predict initial inpainting results. Finally, we refine the initial results to reconstruct a more faithful image. Experimental results on multiple datasets show that the proposed CMT algorithm outperforms existing algorithms significantly. The source code is available at https://github.com/keunsoo-ko/CMT.
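The core idea of the abstract — self-attention guided by a continuous validity mask, followed by a mask update that models error propagation — can be sketched as follows. This is a minimal illustrative sketch, not the authors' implementation (see the GitHub repository for that); the logit-biasing scheme and the attention-weighted mask update here are simplifying assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def masked_self_attention(x, m, Wq, Wk, Wv):
    """One masked self-attention + mask update step (illustrative sketch).

    x: (n, d) token features; m: (n,) continuous mask in [0, 1]
    (1 = clean token, 0 = fully corrupted). Keys are down-weighted in
    proportion to their corruption, and the mask is then updated by
    propagating validity through the same attention weights.
    """
    q, k, v = x @ Wq, x @ Wk, x @ Wv
    logits = (q @ k.T) / np.sqrt(k.shape[-1])
    # Hypothetical choice: bias logits so corrupted keys contribute less.
    logits = logits + np.log(m + 1e-6)[None, :]
    attn = softmax(logits, axis=-1)
    y = attn @ v
    # Mask update: each output token's validity is the attention-weighted
    # average of the input validities, modeling error propagation.
    m_new = np.clip(attn @ m, 0.0, 1.0)
    return y, m_new
```

Stacking several such steps corresponds to the MSAU layers described above: features and mask are refined jointly, so later layers know how much residual error each token carries.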
Cite
Text
Ko and Kim. "Continuously Masked Transformer for Image Inpainting." International Conference on Computer Vision, 2023. doi:10.1109/ICCV51070.2023.01211
Markdown
[Ko and Kim. "Continuously Masked Transformer for Image Inpainting." International Conference on Computer Vision, 2023.](https://mlanthology.org/iccv/2023/ko2023iccv-continuously/) doi:10.1109/ICCV51070.2023.01211
BibTeX
@inproceedings{ko2023iccv-continuously,
title = {{Continuously Masked Transformer for Image Inpainting}},
author = {Ko, Keunsoo and Kim, Chang-Su},
booktitle = {International Conference on Computer Vision},
year = {2023},
pages = {13169-13178},
doi = {10.1109/ICCV51070.2023.01211},
url = {https://mlanthology.org/iccv/2023/ko2023iccv-continuously/}
}