L-CoIns: Language-Based Colorization with Instance Awareness

Abstract

Language-based colorization produces plausible colors consistent with the language description provided by the user. Recent studies introduce additional annotation to prevent color-object coupling and mismatch issues, but they still have difficulty in distinguishing instances corresponding to the same object words. In this paper, we propose a transformer-based framework to automatically aggregate similar image patches and achieve instance awareness without any additional knowledge. By applying our presented luminance augmentation and counter-color loss to break down the statistical correlation between luminance and color words, our model is driven to synthesize colors with better descriptive consistency. We further collect a dataset to provide distinctive visual characteristics and detailed language descriptions for multiple instances in the same image. Extensive experiments demonstrate our advantages of synthesizing visually pleasing and description-consistent results of instance-aware colorization.

Cite

Text

Chang et al. "L-CoIns: Language-Based Colorization with Instance Awareness." Conference on Computer Vision and Pattern Recognition, 2023. doi:10.1109/CVPR52729.2023.01842

Markdown

[Chang et al. "L-CoIns: Language-Based Colorization with Instance Awareness." Conference on Computer Vision and Pattern Recognition, 2023.](https://mlanthology.org/cvpr/2023/chang2023cvpr-lcoins/) doi:10.1109/CVPR52729.2023.01842

BibTeX

@inproceedings{chang2023cvpr-lcoins,
  title     = {{L-CoIns: Language-Based Colorization with Instance Awareness}},
  author    = {Chang, Zheng and Weng, Shuchen and Zhang, Peixuan and Li, Yu and Li, Si and Shi, Boxin},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2023},
  pages     = {19221--19230},
  doi       = {10.1109/CVPR52729.2023.01842},
  url       = {https://mlanthology.org/cvpr/2023/chang2023cvpr-lcoins/}
}