A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning
Abstract
Multi-label classification (MLC) allows complex dependencies among labels, making it better suited to modeling many real-world problems. However, data annotation for training MLC models becomes much more labor-intensive due to the correlated (hence non-exclusive) labels and a potentially large and sparse label space. We propose to conduct multi-label active learning (ML-AL) through a novel integrated Gaussian Process-Bayesian Bernoulli Mixture model (GP-B$^2$M) to accurately quantify a data sample's overall contribution to a correlated label space and choose the most informative samples for cost-effective annotation. In particular, the B$^2$M encodes label correlations using a Bayesian Bernoulli mixture of label clusters, where each mixture component corresponds to a global pattern of label correlations. To tackle highly sparse labels under AL, the B$^2$M is further integrated with a predictive GP to connect data features as an effective inductive bias and achieve a feature-component-label mapping. The GP predicts the coefficients of the mixture components, which help to recover the final set of labels of a data sample. A novel auxiliary-variable-based variational inference algorithm is developed to tackle the non-conjugacy introduced by the mapping process and to enable efficient end-to-end posterior inference. The model also outputs a predictive distribution that provides both the label predictions and their correlations in the form of a label covariance matrix. A principled sampling function is designed accordingly to naturally capture both the feature uncertainty (through the GP) and the label covariance (through the B$^2$M) for effective data sampling. Experiments on real-world multi-label datasets demonstrate the state-of-the-art AL performance of the proposed GP-B$^2$M model.
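The label covariance matrix mentioned in the abstract can be illustrated with the standard moments of a plain Bernoulli mixture. The sketch below is not the authors' implementation: it assumes fixed mixture coefficients (in GP-B$^2$M these would be predicted by the GP from the features) and fixed per-component label probabilities, and the function name `bernoulli_mixture_moments` and the toy numbers are hypothetical.

```python
import numpy as np

def bernoulli_mixture_moments(weights, probs):
    """Mean and covariance of a binary label vector under a Bernoulli mixture.

    weights: shape (K,), mixture coefficients that sum to 1 (assumed given here)
    probs:   shape (K, L), probs[k, l] = P(label l = 1 | component k)
    """
    weights = np.asarray(weights, dtype=float)
    probs = np.asarray(probs, dtype=float)

    # E[y] is the weighted average of the component means.
    mean = weights @ probs

    # Labels are independent within a component, so
    # E[y y^T | k] = diag(p_k * (1 - p_k)) + p_k p_k^T.
    second = np.einsum("k,ki,kj->ij", weights, probs, probs)
    second += np.diag(weights @ (probs * (1.0 - probs)))

    # Cov[y] = E[y y^T] - E[y] E[y]^T
    cov = second - np.outer(mean, mean)
    return mean, cov

# Toy example: two label clusters over three labels.
# Labels 0 and 1 tend to co-occur; label 2 tends to appear without them.
weights = np.array([0.5, 0.5])
probs = np.array([[0.9, 0.9, 0.1],
                  [0.1, 0.1, 0.9]])
mean, cov = bernoulli_mixture_moments(weights, probs)
```

In this toy setting the off-diagonal entries of `cov` are positive for labels 0 and 1 and negative between either of them and label 2, which is the sense in which mixture components capture global patterns of label correlation.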
Cite
Text
Shi et al. "A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning." Neural Information Processing Systems, 2021.
Markdown
[Shi et al. "A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/shi2021neurips-gaussian/)
BibTeX
@inproceedings{shi2021neurips-gaussian,
title = {{A Gaussian Process-Bayesian Bernoulli Mixture Model for Multi-Label Active Learning}},
author = {Shi, Weishi and Yu, Dayou and Yu, Qi},
booktitle = {Neural Information Processing Systems},
year = {2021},
url = {https://mlanthology.org/neurips/2021/shi2021neurips-gaussian/}
}