Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues

Abstract

Recent monocular metric depth estimation (MMDE) methods have made notable progress towards zero-shot generalization. However, they still exhibit a significant performance drop on out-of-distribution datasets. We address this limitation by injecting defocus blur cues at inference time into Marigold, a pre-trained diffusion model for zero-shot, scale-invariant monocular depth estimation (MDE). Our method effectively turns Marigold into a metric depth predictor in a training-free manner. To incorporate defocus cues, we capture two images with a small and a large aperture from the same viewpoint. To recover metric depth, we then optimize the metric depth scaling parameters and the noise latents of Marigold at inference time using gradients from a loss function based on the defocus-blur image formation model. We compare our method against existing state-of-the-art zero-shot MMDE methods on a self-collected real dataset, showing quantitative and qualitative improvements.
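To illustrate why defocus blur can pin down metric scale, the sketch below uses the standard thin-lens circle-of-confusion formula and fits a scale and shift that map a relative depth map to metres. It is a toy under stated assumptions, not the authors' implementation: it replaces the gradient-based optimization with a brute-force grid search, leaves out Marigold and its noise latents entirely, and assumes the blur diameter is observed directly at a few pixels rather than inferred from the small-aperture and large-aperture image pair. The function names, the scale-plus-shift parameterization, and all lens settings are hypothetical.

```python
def coc_diameter(depth_m, focus_m, focal_len_m, f_number):
    """Thin-lens circle-of-confusion diameter on the sensor, in metres."""
    aperture_m = focal_len_m / f_number  # aperture diameter A = f / N
    return (aperture_m * focal_len_m * abs(depth_m - focus_m)
            / (depth_m * (focus_m - focal_len_m)))


def fit_scale_shift(rel_depths, observed_coc, focus_m, focal_len_m, f_number,
                    scales, shifts):
    """Grid-search the (scale, shift) whose predicted blur best matches the data.

    Assumed model (hypothetical): metric_depth = scale * rel_depth + shift.
    """
    best = None
    for s in scales:
        for t in shifts:
            loss = 0.0
            for r, c in zip(rel_depths, observed_coc):
                z = s * r + t  # candidate metric depth in metres
                loss += (coc_diameter(z, focus_m, focal_len_m, f_number) - c) ** 2
            if best is None or loss < best[0]:
                best = (loss, s, t)
    return best[1], best[2]


# Synthetic setup: 50 mm lens at f/1.8 focused at 1 m (all values illustrative).
FOCUS_M, FOCAL_M, F_NUMBER = 1.0, 0.05, 1.8
REL = [0.0, 0.25, 0.5, 0.75, 1.0]      # relative depths in [0, 1]
TRUE_SCALE, TRUE_SHIFT = 2.0, 0.5      # ground truth used to simulate blur
OBS = [coc_diameter(TRUE_SCALE * r + TRUE_SHIFT, FOCUS_M, FOCAL_M, F_NUMBER)
       for r in REL]

GRID_SCALES = [i / 10 for i in range(1, 41)]   # 0.1 .. 4.0 m
GRID_SHIFTS = [i / 10 for i in range(1, 21)]   # 0.1 .. 2.0 m
scale, shift = fit_scale_shift(REL, OBS, FOCUS_M, FOCAL_M, F_NUMBER,
                               GRID_SCALES, GRID_SHIFTS)
print(scale, shift)
```

Because the blur diameter depends on absolute distance from the focal plane, a relative depth map that is rescaled incorrectly predicts the wrong blur, which is what makes the scale recoverable in this toy. A differentiable blur model would allow the same loss to be minimized with gradients instead of a grid.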

Cite

Text

Talegaonkar et al. "Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues." Advances in Neural Information Processing Systems, 2025.

Markdown

[Talegaonkar et al. "Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/talegaonkar2025neurips-repurposing/)

BibTeX

@inproceedings{talegaonkar2025neurips-repurposing,
  title     = {{Repurposing Marigold for Zero-Shot Metric Depth Estimation via Defocus Blur Cues}},
  author    = {Talegaonkar, Chinmay and Suresh, Nikhil Gandudi and Novack, Zachary and Belhe, Yash and Nagasamudra, Priyanka and Antipa, Nicholas},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/talegaonkar2025neurips-repurposing/}
}