Enhancing Post-Treatment Visual Acuity Prediction with Multimodal Deep Learning on Small-Scale Clinical and OCT Datasets

Anderson, Matthew; Corona, Veronica; Stankiewicz, Agnieszka; Habib, Maged; Steel, David H.; Obara, Boguslaw

MIDL 2025

Abstract

Predicting visual acuity (VA) outcomes after treatment in diabetic macular edema (DME) is crucial for optimizing patient management but remains challenging due to the heterogeneity of patient responses and the limited availability of comprehensive datasets. While existing predictive models have shown promise, their clinical deployment is hindered by their reliance on large training datasets that are often unavailable in real-world settings. We address this challenge by developing a multimodal deep learning framework specifically designed for small-scale clinical cohorts. Our approach integrates optical coherence tomography (OCT) images with carefully selected clinical parameters through a cross-modal fusion architecture that leverages attention mechanisms to enhance feature interaction and predictive accuracy. We validate our framework across two clinically distinct real-world cohorts: treatment-naïve patients ($n=35$) receiving intensive anti-VEGF therapy and chronically treated patients ($n=20$) receiving sustained-release corticosteroid implants. This approach achieves mean absolute errors in post-treatment VA prediction of $3.07 \pm 0.82$ and $4.20 \pm 2.79$ Early Treatment Diabetic Retinopathy Study (ETDRS) letters, respectively, falling within the acceptable range of clinical measurement variability and meeting thresholds for statistically significant visual change detection with $\geq90\%$ confidence. This work demonstrates that appropriately designed multimodal architectures can achieve clinically meaningful prediction accuracy even with limited datasets, offering a practical foundation for personalized DME management in typical clinical settings where large datasets are unavailable.
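The abstract describes the fusion step only at a high level, so the following is a minimal NumPy sketch of one common form of cross-modal attention, not the authors' architecture. It assumes clinical parameters are embedded as a few query tokens and OCT features as key/value tokens; the function name `cross_attention_fuse`, all dimensions, and the random projection matrices are hypothetical and chosen purely for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability before exponentiating.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_attention_fuse(clinical_tokens, oct_tokens, w_q, w_k, w_v):
    """Hypothetical fusion: clinical tokens attend over OCT feature tokens."""
    q = clinical_tokens @ w_q                    # (n_clin, d) queries
    k = oct_tokens @ w_k                         # (n_oct, d) keys
    v = oct_tokens @ w_v                         # (n_oct, d) values
    scores = q @ k.T / np.sqrt(q.shape[-1])      # scaled dot-product scores
    weights = softmax(scores, axis=-1)           # each row sums to 1
    attended = weights @ v                       # OCT context per clinical token
    # Assumption: pool over tokens and concatenate both modalities.
    fused = np.concatenate([attended.mean(axis=0), clinical_tokens.mean(axis=0)])
    return fused, weights

# Illustrative shapes only (not taken from the paper).
rng = np.random.default_rng(0)
d_clin, d_oct, d = 8, 16, 8
clinical_tokens = rng.normal(size=(4, d_clin))   # 4 embedded clinical parameters
oct_tokens = rng.normal(size=(10, d_oct))        # 10 OCT feature tokens
w_q = rng.normal(size=(d_clin, d))
w_k = rng.normal(size=(d_oct, d))
w_v = rng.normal(size=(d_oct, d))

fused, weights = cross_attention_fuse(clinical_tokens, oct_tokens, w_q, w_k, w_v)
print(fused.shape, weights.shape)
```

In a trained model the projections would be learned and the fused vector would feed a regression head that predicts post-treatment VA in ETDRS letters; here they are random, so the output is meaningful only for its shapes.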

Cite

Text

Anderson et al. "Enhancing Post-Treatment Visual Acuity Prediction with Multimodal Deep Learning on Small-Scale Clinical and OCT Datasets." Medical Imaging with Deep Learning, 2025.

Markdown

[Anderson et al. "Enhancing Post-Treatment Visual Acuity Prediction with Multimodal Deep Learning on Small-Scale Clinical and OCT Datasets." Medical Imaging with Deep Learning, 2025.](https://mlanthology.org/midl/2025/anderson2025midl-enhancing/)

BibTeX

@inproceedings{anderson2025midl-enhancing,
  title     = {{Enhancing Post-Treatment Visual Acuity Prediction with Multimodal Deep Learning on Small-Scale Clinical and OCT Datasets}},
  author    = {Anderson, Matthew and Corona, Veronica and Stankiewicz, Agnieszka and Habib, Maged and Steel, David H. and Obara, Boguslaw},
  booktitle = {Medical Imaging with Deep Learning},
  year      = {2025},
  url       = {https://mlanthology.org/midl/2025/anderson2025midl-enhancing/}
}