Probability-Density-Aware Semi-Supervised Learning
Abstract
In semi-supervised learning (SSL), the cluster assumption is widely accepted: features in different high-density regions are assumed to belong to different categories. However, existing algorithms largely ignore it, and it still lacks a mathematical explanation. This paper first proposes a theorem that statistically explains the cluster assumption and proves that the probability density can significantly help to make full use of this prior. Based on the theorem, a Probability-Density-Aware Measure (PM) is proposed to discern the similarity between neighboring points. PM is then used to improve Label Propagation, yielding a new pseudo-labeling algorithm, Probability-Density-Aware Label Propagation (PMLP). We also prove that traditional pseudo-labeling based on first-order similarity can be viewed as a special case of PMLP, which provides a theoretical understanding of PMLP's superior performance. Extensive experiments demonstrate that PMLP achieves outstanding performance compared with other recent methods.
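The abstract does not give the definition of PM, so the following is only a minimal sketch of the general idea behind density-aware label propagation, not the paper's PM or PMLP. It assumes a hypothetical weighting in which the usual Gaussian affinity between two points is scaled by a kernel density estimate at their midpoint, so that pairs separated by a low-density gap are treated as less similar. The function names, the midpoint rule, and the parameters `sigma`, `bandwidth`, and `alpha` are all illustrative assumptions. The propagation step is the standard closed-form label propagation on a normalized graph.

```python
import numpy as np


def kde(queries, data, bandwidth):
    """Unnormalized Gaussian kernel density estimate at each query point."""
    d2 = ((queries[:, None, :] - data[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bandwidth ** 2)).mean(axis=1)


def density_aware_affinity(x, sigma=1.0, bandwidth=0.5):
    """Gaussian affinity scaled by the estimated density at each pair's midpoint.

    The midpoint scaling is an illustrative assumption, not the paper's PM.
    """
    n = len(x)
    d2 = ((x[:, None, :] - x[None, :, :]) ** 2).sum(-1)
    w = np.exp(-d2 / (2 * sigma ** 2))
    mid = (x[:, None, :] + x[None, :, :]) / 2  # shape (n, n, dim)
    p_mid = kde(mid.reshape(n * n, -1), x, bandwidth).reshape(n, n)
    w = w * p_mid / p_mid.max()  # scaling factor lies in (0, 1]
    np.fill_diagonal(w, 0.0)  # no self-loops
    return w


def label_propagation(w, y_onehot, alpha=0.9):
    """Closed-form propagation: F = (I - alpha * S)^(-1) Y, S = D^(-1/2) W D^(-1/2)."""
    deg = w.sum(axis=1)
    s = w / np.sqrt(np.outer(deg, deg))
    f = np.linalg.solve(np.eye(len(w)) - alpha * s, y_onehot)
    return f.argmax(axis=1)


# Toy data: two 1-D clusters separated by an empty (low-density) gap.
x = np.concatenate([np.linspace(-1, 1, 10), np.linspace(5, 7, 10)])[:, None]

# One labeled point per cluster; every other row of y stays zero (unlabeled).
y = np.zeros((20, 2))
y[0, 0] = 1.0
y[19, 1] = 1.0

w = density_aware_affinity(x)
pred = label_propagation(w, y)
print(pred)
```

In this toy setup, the affinity between the two points facing each other across the gap is smaller than the plain Gaussian affinity would be, because the density at their midpoint is low. Labels therefore spread within each cluster far more readily than across the gap.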
Cite
Text
Liu et al. "Probability-Density-Aware Semi-Supervised Learning." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I18.34085
Markdown
[Liu et al. "Probability-Density-Aware Semi-Supervised Learning." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/liu2025aaai-probability/) doi:10.1609/AAAI.V39I18.34085
BibTeX
@inproceedings{liu2025aaai-probability,
title = {{Probability-Density-Aware Semi-Supervised Learning}},
author = {Liu, Shuyang and Zheng, Ruiqiu and Shen, Yunhang and Yu, Zhou and Li, Ke and Sun, Xing and Lin, Shaohui},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2025},
pages = {18943--18950},
doi = {10.1609/AAAI.V39I18.34085},
url = {https://mlanthology.org/aaai/2025/liu2025aaai-probability/}
}