EdiBERT: A Generative Model for Image Editing

Abstract

Advances in computer vision are pushing the limits of image manipulation, with generative models sampling highly realistic, detailed images for a variety of tasks. However, a specialized model is often developed and trained for each specific task, even though many image editing tasks share similarities. In denoising, inpainting, or image compositing, the goal is always to generate a realistic image from a low-quality one. In this paper, we take a step towards a unified approach for image editing. To do so, we propose EdiBERT, a bidirectional transformer that re-samples image patches conditioned on a given image. Using one generic objective, we show that the model resulting from a single training matches state-of-the-art GAN-inversion methods on several tasks: image denoising, image completion, and image composition. We also provide several insights on the latent space of vector-quantized auto-encoders, such as locality and reconstruction capabilities. The code is available at https://github.com/EdiBERT4ImageManipulation/EdiBERT.
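To make the abstract's idea concrete, below is a minimal PyTorch sketch of the editing loop it describes: an image is encoded into a grid of discrete codebook tokens by a vector-quantized auto-encoder, a BERT-like bidirectional transformer re-samples the tokens in the edited region conditioned on the whole sequence, and the result is decoded back to pixels. This is an illustrative sketch, not the authors' implementation; the module names, shapes, and the resampling schedule (sequential passes over masked positions) are assumptions for exposition, and the VQ encoder/decoder is left abstract.

import torch
import torch.nn as nn

VOCAB_SIZE = 1024   # assumed codebook size of the vector-quantized auto-encoder
GRID = 16           # assumed 16x16 token grid (e.g. for a 256x256 image)
SEQ_LEN = GRID * GRID

class BidirectionalTokenModel(nn.Module):
    """BERT-like transformer over codebook indices: every position attends
    to the full sequence, so any patch can be re-sampled conditioned on
    the whole image (hypothetical architecture parameters)."""
    def __init__(self, dim=512, depth=8, heads=8):
        super().__init__()
        self.tok = nn.Embedding(VOCAB_SIZE, dim)
        self.pos = nn.Parameter(torch.zeros(1, SEQ_LEN, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)
        self.head = nn.Linear(dim, VOCAB_SIZE)

    def forward(self, tokens):                   # tokens: (B, SEQ_LEN) long
        h = self.tok(tokens) + self.pos
        return self.head(self.encoder(h))        # (B, SEQ_LEN, VOCAB_SIZE)

@torch.no_grad()
def resample_region(model, tokens, region_mask, passes=2, temperature=1.0):
    """Re-sample the token positions flagged in `region_mask` (e.g. the
    noisy, missing, or composited area), one position at a time, each time
    conditioning on the current state of the whole sequence."""
    tokens = tokens.clone()                      # tokens: (SEQ_LEN,) long
    positions = region_mask.nonzero(as_tuple=True)[0]
    for _ in range(passes):
        for p in positions:
            logits = model(tokens.unsqueeze(0))[0, p] / temperature
            tokens[p] = torch.multinomial(logits.softmax(-1), 1)
    return tokens

# Sketch of usage, assuming hypothetical VQ helpers encode_to_tokens /
# decode_from_tokens from a pre-trained vector-quantized auto-encoder:
#   tokens = encode_to_tokens(edited_image)           # (SEQ_LEN,) long
#   mask = region_to_token_mask(edit_region)          # (SEQ_LEN,) bool
#   tokens = resample_region(model, tokens, mask)
#   output = decode_from_tokens(tokens)

The single generic objective mentioned in the abstract corresponds, under this reading, to standard masked-token prediction on the VQ token grid; the same trained model then serves denoising, completion, and composition simply by choosing which positions `region_mask` flags for re-sampling.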

Cite

Text

Issenhuth et al. "EdiBERT: A Generative Model for Image Editing." Transactions on Machine Learning Research, 2023.

Markdown

[Issenhuth et al. "EdiBERT: A Generative Model for Image Editing." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/issenhuth2023tmlr-edibert/)

BibTeX

@article{issenhuth2023tmlr-edibert,
  title     = {{EdiBERT: A Generative Model for Image Editing}},
  author    = {Issenhuth, Thibaut and Tanielian, Ugo and Mary, Jeremie and Picard, David},
  journal   = {Transactions on Machine Learning Research},
  year      = {2023},
  url       = {https://mlanthology.org/tmlr/2023/issenhuth2023tmlr-edibert/}
}