Using Distributionally Robust Optimization to Improve Robustness in Cancer Pathology
Abstract
Computer vision (CV) approaches applied to digital pathology have informed biological discovery and clinical decision-making. However, batch effects in images represent a major challenge to effective analysis. A CV model trained using Empirical Risk Minimization (ERM) risks learning batch effects when they align with the labels and serve as spurious correlates. The standard methods to circumvent learning such confounders include (i) application of image augmentation techniques and (ii) examination of the learning process through external validation (e.g., unseen data from a comparable dataset collected at another hospital). The latter approach is data-hungry, and the former risks occluding biological signal. Here, we suggest two solutions from the Distributionally Robust Optimization (DRO) family. Our contributions are (i) a DRO algorithm using abstention, which is a slight variation on existing abstention-based DRO algorithms, and (ii) a group-DRO method where groups are defined as the hospitals from which data are collected. We find that the model trained using abstention-based DRO outperforms a model trained using ERM by 9.9% F1 in identifying tumor vs. normal tissue in lung adenocarcinoma (LUAD), at the expense of coverage. Further, by examining the areas on which the model abstained with a pathologist, we find that the model trained using a DRO method is more robust to heterogeneity and artifacts in the tissue. Based on these findings, we propose selecting models that are more robust to spurious features for translational discovery and clinical decision support.
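The two ideas named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify their algorithms, so the function names, the toy numbers, and the confidence threshold below are all assumptions. The first part contrasts the ERM objective (mean loss over all samples) with the worst-group objective that group DRO targets, using hospitals as groups. The second part uses plain confidence thresholding as a stand-in for abstention, and reports accuracy rather than F1, to show how abstaining trades coverage for performance on the retained samples.

```python
import numpy as np


def erm_risk(losses):
    """Mean loss over all samples (the ERM objective)."""
    return float(np.mean(losses))


def worst_group_risk(losses, groups):
    """Largest per-group mean loss (the quantity group DRO aims to minimize)."""
    losses = np.asarray(losses, dtype=float)
    groups = np.asarray(groups)
    return max(float(losses[groups == g].mean()) for g in np.unique(groups))


def abstain_and_score(probs, labels, threshold=0.8):
    """Predict only where confidence >= threshold.

    Returns (coverage, accuracy on the covered samples). This is simple
    confidence thresholding, used here as an illustrative stand-in for
    abstention; it is not the paper's abstention-based DRO algorithm.
    """
    probs = np.asarray(probs, dtype=float)
    labels = np.asarray(labels)
    confidence = np.maximum(probs, 1.0 - probs)  # binary-classifier confidence
    keep = confidence >= threshold
    coverage = float(keep.mean())
    if not keep.any():
        return coverage, float("nan")
    preds = (probs[keep] >= 0.5).astype(int)
    accuracy = float((preds == labels[keep]).mean())
    return coverage, accuracy


# Hypothetical per-sample losses from two hospitals, "A" and "B".
losses = [0.1, 0.2, 0.1, 0.9, 1.1]
hospitals = ["A", "A", "A", "B", "B"]
avg_risk = erm_risk(losses)                       # dominated by hospital A
worst_risk = worst_group_risk(losses, hospitals)  # exposes hospital B

# Hypothetical tumor probabilities and ground-truth labels (1 = tumor).
probs = [0.95, 0.6, 0.1, 0.45, 0.9]
labels = [1, 0, 0, 1, 1]
coverage, accuracy = abstain_and_score(probs, labels, threshold=0.8)
full_coverage, full_accuracy = abstain_and_score(probs, labels, threshold=0.5)
```

In this toy data the average risk looks moderate while the worst-hospital risk is roughly twice as large, which is the gap group DRO is designed to close. Raising the threshold from 0.5 to 0.8 drops the two low-confidence samples, both of which would have been misclassified, so accuracy on the covered samples rises while coverage falls.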
Cite
Text
Hari et al. "Using Distributionally Robust Optimization to Improve Robustness in Cancer Pathology." NeurIPS 2021 Workshops: DistShift, 2021.
Markdown
[Hari et al. "Using Distributionally Robust Optimization to Improve Robustness in Cancer Pathology." NeurIPS 2021 Workshops: DistShift, 2021.](https://mlanthology.org/neuripsw/2021/hari2021neuripsw-using/)
BibTeX
@inproceedings{hari2021neuripsw-using,
title = {{Using Distributionally Robust Optimization to Improve Robustness in Cancer Pathology}},
author = {Hari, Surya Narayanan and Van Allen, Eliezer and Nyman, Jackson and Mehta, Nicita and Jiang, Bowen and Elmarakeby, Haitham and Dietlein, Felix and Rosenthal, Jacob and Sengupta, Eshna and Umeton, Renato and Chowdhury, Alexander},
booktitle = {NeurIPS 2021 Workshops: DistShift},
year = {2021},
url = {https://mlanthology.org/neuripsw/2021/hari2021neuripsw-using/}
}