L-CoDe: Language-Based Colorization Using Color-Object Decoupled Conditions

Abstract

Colorizing a grayscale image is an inherently ill-posed problem with multi-modal uncertainty. Language-based colorization offers a natural form of interaction that reduces this uncertainty through a user-provided caption. However, color-object coupling and mismatch make the mapping from words to colors difficult. In this paper, we propose L-CoDe, a Language-based Colorization network using color-object Decoupled conditions. A predictor for the object-color corresponding matrix (OCCM) and a novel attention transfer module (ATM) are introduced to solve the color-object coupling problem. To deal with color-object mismatch, which results in incorrect color-object correspondence, we adopt a soft-gated injection module (SIM). We further present a new dataset containing annotated color-object pairs to provide supervisory signals for resolving the coupling problem. Experimental results show that our approach outperforms state-of-the-art methods conditioned on captions.
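
As a rough illustration of the kind of supervisory signal that annotated color-object pairs could provide, the sketch below builds a binary word-by-word matrix from a tokenized caption. The matrix layout, the function name `build_occm`, the index-pair annotation format, and the example caption are all assumptions made for illustration; the abstract does not specify them, and this is not the authors' implementation or dataset format.

```python
from typing import List, Tuple


def build_occm(tokens: List[str], pairs: List[Tuple[int, int]]) -> List[List[int]]:
    """Build a hypothetical N x N binary color-object matrix.

    Entry [i][j] is 1 when token i is a color word annotated as
    describing the object word at token j, and 0 otherwise.
    (Assumed layout, for illustration only.)
    """
    n = len(tokens)
    occm = [[0] * n for _ in range(n)]
    for color_idx, object_idx in pairs:
        # Reject annotations that point outside the caption.
        if not (0 <= color_idx < n and 0 <= object_idx < n):
            raise IndexError(f"pair ({color_idx}, {object_idx}) is outside the caption")
        occm[color_idx][object_idx] = 1
    return occm


# Hypothetical caption and annotations: "red" describes "car", "green" describes "tree".
tokens = "a red car next to a green tree".split()
pairs = [(1, 2), (6, 7)]
occm = build_occm(tokens, pairs)

for word, row in zip(tokens, occm):
    print(f"{word:>6} {row}")
```

In this toy layout each color word's row marks exactly the object it modifies, so "red" is tied to "car" rather than to "tree". That is the kind of explicit correspondence the coupling problem calls for, under the stated assumptions.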

Cite

Text

Weng et al. "L-CoDe: Language-Based Colorization Using Color-Object Decoupled Conditions." AAAI Conference on Artificial Intelligence, 2022. doi:10.1609/AAAI.V36I3.20170

Markdown

[Weng et al. "L-CoDe: Language-Based Colorization Using Color-Object Decoupled Conditions." AAAI Conference on Artificial Intelligence, 2022.](https://mlanthology.org/aaai/2022/weng2022aaai-l/) doi:10.1609/AAAI.V36I3.20170

BibTeX

@inproceedings{weng2022aaai-l,
  title     = {{L-CoDe: Language-Based Colorization Using Color-Object Decoupled Conditions}},
  author    = {Weng, Shuchen and Wu, Hao and Chang, Zheng and Tang, Jiajun and Li, Si and Shi, Boxin},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2022},
  pages     = {2677--2684},
  doi       = {10.1609/AAAI.V36I3.20170},
  url       = {https://mlanthology.org/aaai/2022/weng2022aaai-l/}
}