Probability-Density-Aware Semi-Supervised Learning

Abstract

In semi-supervised learning (SSL), the cluster assumption is widely accepted: features in different high-density regions belong to different categories. However, existing algorithms largely ignore it, and it still lacks a mathematical explanation. This paper first proposes a theorem that statistically explains the cluster assumption and proves that probability density can significantly help in making full use of this prior. Based on the theorem, a Probability-Density-Aware Measure (PM) is proposed to discern the similarity between neighboring points. PM is deployed to improve label propagation, yielding a new pseudo-labeling algorithm, Probability-Density-Aware Label Propagation (PMLP). We also prove that traditional first-order-similarity pseudo-labeling can be viewed as a special case of PMLP, which provides a comprehensive theoretical understanding of PMLP's superior performance. Extensive experiments demonstrate that PMLP achieves outstanding performance compared with other recent methods.
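
For orientation, the following is a minimal sketch of classical label propagation over a first-order (cosine) similarity graph, the kind of baseline the abstract describes as a special case of PMLP. It is not the paper's algorithm: the abstract does not define PM, so no density-aware term is included here. The function name `label_propagation` and the parameters `k`, `alpha`, and `iters` are illustrative assumptions.

```python
import numpy as np


def label_propagation(features, labels, num_classes, k=5, alpha=0.99, iters=50):
    """Classical label propagation on a kNN cosine-similarity graph.

    Illustrative baseline only (hypothetical helper, not the paper's PMLP).
    `labels` is an integer array with -1 marking unlabeled points.
    Returns one pseudo-label per point.
    """
    n = features.shape[0]

    # First-order similarity: cosine similarity between feature vectors.
    norms = np.maximum(np.linalg.norm(features, axis=1, keepdims=True), 1e-12)
    normed = features / norms
    sim = np.clip(normed @ normed.T, 0.0, None)
    np.fill_diagonal(sim, 0.0)

    # Keep only the k most similar neighbours of each point, then symmetrize.
    idx = np.argsort(-sim, axis=1)[:, :k]
    rows = np.arange(n)[:, None]
    affinity = np.zeros_like(sim)
    affinity[rows, idx] = sim[rows, idx]
    affinity = np.maximum(affinity, affinity.T)

    # Symmetric normalization: S = D^{-1/2} W D^{-1/2}.
    degree = np.maximum(affinity.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(degree)
    s = affinity * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]

    # One-hot matrix for the labeled points; unlabeled rows stay zero.
    y = np.zeros((n, num_classes))
    labeled = labels >= 0
    y[labeled, labels[labeled]] = 1.0

    # Iterate F <- alpha * S F + (1 - alpha) * Y.
    f = y.copy()
    for _ in range(iters):
        f = alpha * (s @ f) + (1.0 - alpha) * y
    return f.argmax(axis=1)
```

A density-aware variant would presumably change how the affinity between neighbouring points is computed, but that step is left out because the abstract gives no formula for it.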

Cite

Text

Liu et al. "Probability-Density-Aware Semi-Supervised Learning." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I18.34085

Markdown

[Liu et al. "Probability-Density-Aware Semi-Supervised Learning." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/liu2025aaai-probability/) doi:10.1609/AAAI.V39I18.34085

BibTeX

@inproceedings{liu2025aaai-probability,
  title     = {{Probability-Density-Aware Semi-Supervised Learning}},
  author    = {Liu, Shuyang and Zheng, Ruiqiu and Shen, Yunhang and Yu, Zhou and Li, Ke and Sun, Xing and Lin, Shaohui},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {18943-18950},
  doi       = {10.1609/AAAI.V39I18.34085},
  url       = {https://mlanthology.org/aaai/2025/liu2025aaai-probability/}
}