Predictive Inference with Weak Supervision

Abstract

The expense of acquiring labels in large-scale statistical machine learning makes partially and weakly labeled data attractive, though it is not always apparent how to leverage such data for model fitting or validation. We present a methodology to bridge the gap between partial supervision and validation, developing a conformal prediction framework to provide valid predictive confidence sets---sets that cover a true label with a prescribed probability, independent of the underlying distribution---using weakly labeled data. To do so, we introduce a (necessary) new notion of coverage and predictive validity, then develop several application scenarios, providing efficient algorithms for classification and several large-scale structured prediction problems. Through several experiments, we corroborate the hypothesis that the new coverage definition allows for tighter and more informative (but valid) confidence sets.
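The coverage guarantee referenced in the abstract is the standard conformal prediction property: a set-valued predictor that contains the true label with a prescribed probability, regardless of the data distribution. For context, below is a minimal sketch of ordinary split conformal prediction for classification, not the weakly supervised procedure the paper develops; the random "classifier" probabilities, variable names, and miscoverage level are illustrative assumptions only.

```python
import numpy as np

# Hypothetical calibration data: softmax probabilities from some pretrained
# classifier (shape [n, K]) together with the true labels (shape [n]).
rng = np.random.default_rng(0)
n, K = 500, 10
cal_probs = rng.dirichlet(np.ones(K), size=n)   # stand-in for model outputs
cal_labels = rng.integers(0, K, size=n)          # stand-in for true labels

alpha = 0.1  # target miscoverage: sets should contain the true label ~90% of the time

# Nonconformity score: one minus the probability assigned to the true label.
cal_scores = 1.0 - cal_probs[np.arange(n), cal_labels]

# Conformal quantile with the finite-sample (n + 1) correction.
q_level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
q_hat = np.quantile(cal_scores, q_level, method="higher")

def prediction_set(test_probs):
    """Return all labels whose nonconformity score falls below the calibrated threshold."""
    return np.nonzero(1.0 - test_probs <= q_hat)[0]

# Example: a confidence set for one new (hypothetical) test point.
test_probs = rng.dirichlet(np.ones(K))
print(prediction_set(test_probs))
```

In the fully labeled setting above, exchangeability of the calibration and test points yields the marginal coverage guarantee; the paper's contribution is to extend this style of guarantee, under a suitably redefined notion of coverage, to data whose labels are only partially or weakly observed.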

Cite

Text

Cauchois et al. "Predictive Inference with Weak Supervision." Journal of Machine Learning Research, 2024.

Markdown

[Cauchois et al. "Predictive Inference with Weak Supervision." Journal of Machine Learning Research, 2024.](https://mlanthology.org/jmlr/2024/cauchois2024jmlr-predictive/)

BibTeX

@article{cauchois2024jmlr-predictive,
  title     = {{Predictive Inference with Weak Supervision}},
  author    = {Cauchois, Maxime and Gupta, Suyash and Ali, Alnur and Duchi, John C.},
  journal   = {Journal of Machine Learning Research},
  year      = {2024},
  pages     = {1--45},
  volume    = {25},
  url       = {https://mlanthology.org/jmlr/2024/cauchois2024jmlr-predictive/}
}