LDP: Language-Driven Dual-Pixel Image Defocus Deblurring Network
Abstract
Recovering sharp images from dual-pixel (DP) pairs with disparity-dependent blur is a challenging task. Existing blur-map-based deblurring methods have demonstrated promising results. In this paper, we propose, to the best of our knowledge, the first framework that introduces the contrastive language-image pre-training (CLIP) framework to estimate accurate blur maps from DP pairs in an unsupervised manner. To this end, we first carefully design text prompts that enable CLIP to understand blur-related geometric prior knowledge from the DP pair. We then propose a format for feeding the stereo DP pair into CLIP, which is pre-trained on monocular images, without any fine-tuning. Given the estimated blur map, we introduce a blur-prior attention block, a blur-weighting loss, and a blur-aware loss to recover the all-in-focus image. Our method achieves state-of-the-art performance in extensive experiments (see Fig. 1).
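To make the prompt-based blur estimation idea concrete, the sketch below scores an image against blur-related text prompts with the public OpenAI CLIP package. This is not the authors' code: the prompt wording, the file name, and the single-view input are illustrative assumptions, whereas LDP carefully designs its prompts and feeds the stereo DP pair into CLIP in a dedicated format.

```python
# A minimal sketch (assuming the `torch` and OpenAI `clip` packages,
# https://github.com/openai/CLIP) of CLIP prompt-based blur scoring.
# The prompts and input below are hypothetical, not the paper's design.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

# Hypothetical blur-related prompts; the paper designs its own.
prompts = ["a sharp, in-focus photo", "a blurry, out-of-focus photo"]
text = clip.tokenize(prompts).to(device)

# A single DP view stands in here for the paper's stereo DP-pair input.
image = preprocess(Image.open("dp_left_view.png")).unsqueeze(0).to(device)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Cosine similarity between the image and each prompt.
    image_features = image_features / image_features.norm(dim=-1, keepdim=True)
    text_features = text_features / text_features.norm(dim=-1, keepdim=True)
    logits = 100.0 * image_features @ text_features.T
    probs = logits.softmax(dim=-1)

# probs[0, 1] acts as a coarse blurriness score; applied per patch, such
# scores would give a rough blur map, which LDP refines far beyond this sketch.
print(f"P(sharp) = {probs[0, 0]:.3f}, P(blurry) = {probs[0, 1]:.3f}")
```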
Cite
Text
Yang et al. "LDP: Language-Driven Dual-Pixel Image Defocus Deblurring Network." Conference on Computer Vision and Pattern Recognition, 2024. doi:10.1109/CVPR52733.2024.02273
Markdown
[Yang et al. "LDP: Language-Driven Dual-Pixel Image Defocus Deblurring Network." Conference on Computer Vision and Pattern Recognition, 2024.](https://mlanthology.org/cvpr/2024/yang2024cvpr-ldp/) doi:10.1109/CVPR52733.2024.02273
BibTeX
@inproceedings{yang2024cvpr-ldp,
title = {{LDP: Language-Driven Dual-Pixel Image Defocus Deblurring Network}},
author = {Yang, Hao and Pan, Liyuan and Yang, Yan and Hartley, Richard and Liu, Miaomiao},
booktitle = {Conference on Computer Vision and Pattern Recognition},
year = {2024},
pages = {24078--24087},
doi = {10.1109/CVPR52733.2024.02273},
url = {https://mlanthology.org/cvpr/2024/yang2024cvpr-ldp/}
}