Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition

Cindy M. Nguyen, Eric R. Chan, Alexander W. Bergman, Gordon Wetzstein

WACV 2024 pp. 4146-4157

Abstract

Capturing images is a key part of automation for high-level tasks such as scene text recognition. Low-light conditions pose a challenge for high-level perception stacks, which are often optimized on well-lit, artifact-free images. Reconstruction methods for low-light images can produce well-lit counterparts, but typically at the cost of high-frequency details critical for downstream tasks. We propose Diffusion in the Dark (DiD), a diffusion model for low-light image reconstruction for text recognition. DiD provides reconstructions that are qualitatively competitive with those of state-of-the-art (SOTA) methods, while preserving high-frequency details even in extremely noisy, dark conditions. We demonstrate that DiD, without any task-specific optimization, can outperform SOTA low-light methods in low-light text recognition on real images, bolstering the potential of diffusion models to solve ill-posed inverse problems.

Cite

Text

Nguyen et al. "Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition." Winter Conference on Applications of Computer Vision, 2024.

Markdown

[Nguyen et al. "Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition." Winter Conference on Applications of Computer Vision, 2024.](https://mlanthology.org/wacv/2024/nguyen2024wacv-diffusion/)

BibTeX

@inproceedings{nguyen2024wacv-diffusion,
  title     = {{Diffusion in the Dark: A Diffusion Model for Low-Light Text Recognition}},
  author    = {Nguyen, Cindy M. and Chan, Eric R. and Bergman, Alexander W. and Wetzstein, Gordon},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2024},
  pages     = {4146--4157},
  url       = {https://mlanthology.org/wacv/2024/nguyen2024wacv-diffusion/}
}