Watching It in Dark: A Target-Aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination

Abstract

Low illumination significantly degrades the performance of learning-based models trained under well-lit conditions. While current methods mitigate this issue through either image-level enhancement or feature-level adaptation, they often focus solely on the image itself, ignoring how the task-relevant target varies with illumination. In this paper, we propose a target-aware representation learning framework designed to improve high-level task performance in low-illumination environments. We perform bi-directional domain alignment on both image appearance and semantic features to bridge data across different illumination conditions. To concentrate more effectively on the target, we design a target highlighting strategy that combines a saliency mechanism with a Temporal Gaussian Mixture Model to emphasize the location and movement of task-relevant targets. We also design a mask token-based representation learning scheme to learn more robust target-aware features. Our framework ensures compact and effective feature representation for high-level vision tasks in low-light settings. Extensive experiments on the CODaN, ExDark, and ARID datasets validate the effectiveness of our approach on a variety of image- and video-based tasks, including classification, detection, and action recognition. Our code is available at https://github.com/ZhangYh994/WiiD.
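
As a rough illustration of how a temporal Gaussian model can emphasize target movement in video, the sketch below keeps a running mean and variance for each pixel and flags pixels that deviate strongly from them. It is not the paper's Temporal Gaussian Mixture Model: it uses a single Gaussian per pixel rather than a mixture, and the function name `motion_masks` and the parameters `lr`, `k`, `init_var`, and `min_var` are assumptions chosen for illustration only.

```python
from typing import List

Frame = List[List[float]]  # grayscale frame stored as rows of pixel intensities


def motion_masks(frames: List[Frame], lr: float = 0.1, k: float = 2.5,
                 init_var: float = 25.0, min_var: float = 1.0) -> List[List[List[int]]]:
    """Return one binary mask per frame after the first.

    A pixel is marked 1 when it lies more than k standard deviations from
    its running per-pixel Gaussian (hypothetical single-component
    simplification, not the mixture model used in the paper).
    """
    height, width = len(frames[0]), len(frames[0][0])
    mean = [row[:] for row in frames[0]]               # initialise the mean from the first frame
    var = [[init_var] * width for _ in range(height)]  # assumed initial variance
    masks = []
    for frame in frames[1:]:
        mask = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                diff = frame[y][x] - mean[y][x]
                if diff * diff > (k * k) * var[y][x]:
                    mask[y][x] = 1                     # outlier: treated as a moving target
                else:
                    # Inlier: fold the pixel into the background statistics.
                    mean[y][x] += lr * diff
                    updated = (1 - lr) * var[y][x] + lr * diff * diff
                    var[y][x] = max(min_var, updated)  # floor keeps the variance from collapsing
        masks.append(mask)
    return masks
```

Calling `motion_masks` on a short list of frames returns masks that could, in principle, be combined with a saliency map to weight target regions; how the paper actually fuses the two signals is not specified in the abstract.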

Cite

Text

Li et al. "Watching It in Dark: A Target-Aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination." Proceedings of the European Conference on Computer Vision (ECCV), 2024. doi:10.1007/978-3-031-73226-3_3

Markdown

[Li et al. "Watching It in Dark: A Target-Aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination." Proceedings of the European Conference on Computer Vision (ECCV), 2024.](https://mlanthology.org/eccv/2024/li2024eccv-watching/) doi:10.1007/978-3-031-73226-3_3

BibTeX

@inproceedings{li2024eccv-watching,
  title     = {{Watching It in Dark: A Target-Aware Representation Learning Framework for High-Level Vision Tasks in Low Illumination}},
  author    = {Li, Yunan and Zhang, Yihao and Li, Shoude and Tian, Long and Quan, Dou and Li, Chaoneng and Miao, Qiguang},
  booktitle = {Proceedings of the European Conference on Computer Vision (ECCV)},
  year      = {2024},
  doi       = {10.1007/978-3-031-73226-3_3},
  url       = {https://mlanthology.org/eccv/2024/li2024eccv-watching/}
}