Conformal Prediction Under Ambiguous Ground Truth
Abstract
Conformal Prediction (CP) allows one to perform rigorous uncertainty quantification by constructing a prediction set $C(X)$ satisfying $\mathbb{P}(Y \in C(X))\geq 1-\alpha$ for a user-chosen $\alpha \in [0,1]$ by relying on calibration data $(X_1,Y_1),...,(X_n,Y_n)$ from $\mathbb{P}=\mathbb{P}^{X} \otimes \mathbb{P}^{Y|X}$. It is typically implicitly assumed that $\mathbb{P}^{Y|X}$ is the ``true'' posterior label distribution. However, in many real-world scenarios, the labels $Y_1,...,Y_n$ are obtained by aggregating expert opinions using a voting procedure, resulting in a one-hot distribution $\mathbb{P}_{\textup{vote}}^{Y|X}$. This is the case for most datasets, even well-known ones like ImageNet. For such ``voted'' labels, CP guarantees thus hold w.r.t. $\mathbb{P}_{\textup{vote}}=\mathbb{P}^X \otimes \mathbb{P}_{\textup{vote}}^{Y|X}$ rather than the true distribution $\mathbb{P}$. In cases with unambiguous ground truth labels, the distinction between $\mathbb{P}_{\textup{vote}}$ and $\mathbb{P}$ is irrelevant. However, when experts do not agree because of ambiguous labels, approximating $\mathbb{P}^{Y|X}$ with a one-hot distribution $\mathbb{P}_{\textup{vote}}^{Y|X}$ ignores this uncertainty. In this paper, we propose to leverage expert opinions to approximate $\mathbb{P}^{Y|X}$ using a non-degenerate distribution $\mathbb{P}_{\textup{agg}}^{Y|X}$. We then develop \emph{Monte Carlo CP} procedures which provide guarantees w.r.t. $\mathbb{P}_{\textup{agg}}=\mathbb{P}^X \otimes \mathbb{P}_{\textup{agg}}^{Y|X}$ by sampling multiple synthetic pseudo-labels from $\mathbb{P}_{\textup{agg}}^{Y|X}$ for each calibration example $X_1,...,X_n$. In a case study of skin condition classification with significant disagreement among expert annotators, we show that applying CP w.r.t. $\mathbb{P}_{\textup{vote}}$ under-covers expert annotations: calibrated for $72\%$ coverage, it falls short by $10\%$ on average; our Monte Carlo CP closes this gap both empirically and theoretically.
We also extend Monte Carlo CP to multi-label classification and CP with calibration examples enriched through data augmentation.
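The Monte Carlo calibration idea from the abstract can be sketched in code. The snippet below is an illustrative split-conformal sketch, not the paper's implementation: the nonconformity score ($1 - $ model probability of the label), the number of pseudo-labels `m`, and the function names are assumptions. It samples `m` pseudo-labels per calibration example from the aggregated distribution $\mathbb{P}_{\textup{agg}}^{Y|X}$, scores all sampled labels, and calibrates a conservative quantile on the enlarged score set.

```python
import numpy as np

def mc_conformal_threshold(cal_probs, agg_label_dists, alpha, m=10, seed=0):
    """Monte Carlo calibration sketch: sample m pseudo-labels per
    calibration example from the aggregated label distribution
    P_agg^{Y|X}, score them, and take a conservative quantile.

    cal_probs:       (n, k) model probabilities on calibration inputs
    agg_label_dists: (n, k) aggregated (non-degenerate) label distributions
    """
    rng = np.random.default_rng(seed)
    n, k = cal_probs.shape
    scores = []
    for i in range(n):
        # draw m synthetic pseudo-labels from P_agg^{Y|X} for example i
        labels = rng.choice(k, size=m, p=agg_label_dists[i])
        # nonconformity score: 1 - model probability of the sampled label
        scores.extend(1.0 - cal_probs[i, labels])
    scores = np.asarray(scores)
    # conservative finite-sample quantile over the n*m sampled scores
    level = min(1.0, np.ceil((len(scores) + 1) * (1 - alpha)) / len(scores))
    return np.quantile(scores, level, method="higher")

def predict_set(test_probs, q):
    # include every label whose nonconformity score is below the threshold
    return [np.where(1.0 - p <= q)[0] for p in test_probs]
```

Calibrating on the pooled $n \cdot m$ sampled scores is what shifts the guarantee from the one-hot voted distribution $\mathbb{P}_{\textup{vote}}$ toward the aggregated distribution $\mathbb{P}_{\textup{agg}}$.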
Cite
Text
Stutz et al. "Conformal Prediction Under Ambiguous Ground Truth." Transactions on Machine Learning Research, 2023.
Markdown
[Stutz et al. "Conformal Prediction Under Ambiguous Ground Truth." Transactions on Machine Learning Research, 2023.](https://mlanthology.org/tmlr/2023/stutz2023tmlr-conformal/)
BibTeX
@article{stutz2023tmlr-conformal,
title = {{Conformal Prediction Under Ambiguous Ground Truth}},
author = {Stutz, David and Roy, Abhijit Guha and Matejovicova, Tatiana and Strachan, Patricia and Cemgil, Ali Taylan and Doucet, Arnaud},
journal = {Transactions on Machine Learning Research},
year = {2023},
url = {https://mlanthology.org/tmlr/2023/stutz2023tmlr-conformal/}
}