LDP: Language-Driven Dual-Pixel Image Defocus Deblurring Network

Abstract

Recovering sharp images from dual-pixel (DP) pairs with disparity-dependent blur is a challenging task. Existing blur-map-based deblurring methods have demonstrated promising results. In this paper, we propose, to the best of our knowledge, the first framework that introduces contrastive language-image pre-training (CLIP) to estimate accurate blur maps from DP pairs in an unsupervised manner. To this end, we first carefully design text prompts that enable CLIP to understand blur-related geometric prior knowledge from the DP pair. We then propose a format for feeding the stereo DP pair into CLIP, which is pre-trained on monocular images, without any fine-tuning. Given the estimated blur map, we introduce a blur-prior attention block, a blur-weighting loss, and a blur-aware loss to recover the all-in-focus image. Our method achieves state-of-the-art performance in extensive experiments (see Fig. 1).
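
The abstract does not specify the prompt design or the DP input format, but the general mechanism of CLIP-based blur scoring can be sketched as follows. This is a minimal illustration, not the authors' implementation: the prompt wording, the file name dp_left_view.png, and the per-image (rather than patch-wise) scoring are all assumptions made for demonstration.

# Minimal sketch of CLIP-based blur scoring (not the paper's method):
# compare an image against blur-related text prompts and read the
# softmax over prompts as a coarse blur-level probability.
import torch
import clip  # OpenAI CLIP: https://github.com/openai/CLIP
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical blur-level prompts; the paper designs its own prompts.
prompts = [
    "a sharp, in-focus photo",
    "a slightly defocused photo",
    "a heavily blurred, out-of-focus photo",
]
text = clip.tokenize(prompts).to(device)

# Hypothetical input: one view of a DP pair.
image = preprocess(Image.open("dp_left_view.png")).unsqueeze(0).to(device)

with torch.no_grad():
    image_feat = model.encode_image(image)
    text_feat = model.encode_text(text)
    # Cosine similarity via L2-normalized features.
    image_feat = image_feat / image_feat.norm(dim=-1, keepdim=True)
    text_feat = text_feat / text_feat.norm(dim=-1, keepdim=True)
    # Softmax over prompts; a blur map would apply this patch-wise
    # rather than once per image.
    probs = (100.0 * image_feat @ text_feat.T).softmax(dim=-1)

print(dict(zip(prompts, probs[0].tolist())))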

Cite

Text

Yang et al. "LDP: Language-Driven Dual-Pixel Image Defocus Deblurring Network." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.02273

Markdown

[Yang et al. "LDP: Language-Driven Dual-Pixel Image Defocus Deblurring Network." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/yang2024cvpr-ldp/) doi:10.1109/CVPR52733.2024.02273

BibTeX

@inproceedings{yang2024cvpr-ldp,
  title     = {{LDP: Language-Driven Dual-Pixel Image Defocus Deblurring Network}},
  author    = {Yang, Hao and Pan, Liyuan and Yang, Yan and Hartley, Richard and Liu, Miaomiao},
  booktitle = {Conference on Computer Vision and Pattern Recognition},
  year      = {2024},
  pages     = {24078--24087},
  doi       = {10.1109/CVPR52733.2024.02273},
  url       = {https://mlanthology.org/cvpr/2024/yang2024cvpr-ldp/}
}