Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification
Abstract
To cope with high annotation costs, training a classifier only from weakly supervised data has attracted a great deal of attention recently. Among various approaches, strengthening supervision from completely unsupervised classification is a promising direction, which typically employs class priors as the only supervision and trains a binary classifier from unlabeled (U) datasets. While existing risk-consistent methods are theoretically grounded with high flexibility, they can learn only from two U sets. In this paper, we propose a new approach for binary classification from $m$ U sets for $m\ge2$. Our key idea is to consider an auxiliary classification task called surrogate set classification (SSC), which is aimed at predicting from which U set each observed sample is drawn. SSC can be solved by a standard (multi-class) classification method, and we use the SSC solution to obtain the final binary classifier through a certain linear-fractional transformation. We build our method in a flexible and efficient end-to-end deep learning framework and prove it to be classifier-consistent. Through experiments, we demonstrate the superiority of our proposed method over state-of-the-art methods.
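The abstract does not spell out the linear-fractional transformation, so the following is only a minimal sketch of how such a link between the binary posterior and the set posterior can arise. It assumes that each U set $j$ is a mixture of the two class-conditional densities with known class prior $\pi_j$, that set $j$ contributes a fraction $\rho_j$ of all samples, and that the test class prior $\pi_D$ is known. Under those assumptions, Bayes' rule gives $p(\text{set}=j \mid x) \propto \rho_j\,[(\pi_j-\pi_D)\,\eta(x) + \pi_D(1-\pi_j)]$ with $\eta(x)=p(y=+1\mid x)$, which is a ratio of two functions linear in $\eta(x)$. The symbols and the function name `set_posteriors` are illustrative choices, not taken from the text above.

```python
def set_posteriors(eta, priors, set_sizes, test_prior):
    """Map a binary posterior eta = p(y=+1 | x) to set posteriors p(set=j | x).

    Assumptions (hypothetical, for illustration only):
      priors     -- class prior pi_j of each unlabeled set
      set_sizes  -- number of samples in each set (gives rho_j)
      test_prior -- class prior pi_D of the test distribution
    """
    total = sum(set_sizes)
    rho = [n / total for n in set_sizes]
    # Unnormalized set posterior: linear in eta for every set j.
    numer = [
        r * ((p - test_prior) * eta + test_prior * (1.0 - p))
        for r, p in zip(rho, priors)
    ]
    # Dividing by the sum (also linear in eta) makes the map linear-fractional.
    denom = sum(numer)
    return [v / denom for v in numer]


if __name__ == "__main__":
    # Three equally sized U sets with different class priors.
    print(set_posteriors(0.9, [0.2, 0.5, 0.8], [100, 100, 100], 0.5))
```

In this reading, a model that outputs $\eta(x)$ could be passed through the map and trained with an ordinary multi-class loss on the set indices, after which $\eta(x)$ itself serves as the binary classifier. Whether this matches the paper's exact construction is not confirmed by the abstract.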
Cite
Text
Lu et al. "Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification." International Conference on Machine Learning, 2021.
Markdown
[Lu et al. "Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/lu2021icml-binary/)
BibTeX
@inproceedings{lu2021icml-binary,
title = {{Binary Classification from Multiple Unlabeled Datasets via Surrogate Set Classification}},
author = {Lu, Nan and Lei, Shida and Niu, Gang and Sato, Issei and Sugiyama, Masashi},
booktitle = {International Conference on Machine Learning},
year = {2021},
pages = {7134--7144},
volume = {139},
url = {https://mlanthology.org/icml/2021/lu2021icml-binary/}
}