Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data

Abstract

We study the behavior of the popular Laplacian Regularization method for Semi-Supervised Learning in the regime of a fixed number of labeled points but a large number of unlabeled points. We show that in $\mathbb{R}^d$, $d \geq 2$, the method is actually not well-posed, and as the number of unlabeled points increases the solution degenerates to a non-informative function. We also contrast the method with the Laplacian Eigenvector method, and discuss the ``smoothness'' assumptions associated with this alternative method.
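For concreteness, the finite-sample estimator whose large-unlabeled-data limit the paper analyzes can be sketched as follows. This is a minimal, assumed setup (Gaussian similarity graph, unnormalized Laplacian, squared-loss fit at the labeled points), not the paper's exact formulation: given a few labeled points and many unlabeled ones, minimize $\sum_{i \in \text{labeled}} (f_i - y_i)^2 + \lambda f^\top L f$, which has a closed-form solution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Two well-separated 2-D clusters; only two points per cluster are labeled.
half = 52
X = np.vstack([rng.normal(-1.0, 0.4, (half, 2)),
               rng.normal(+1.0, 0.4, (half, 2))])
n = 2 * half
labeled_idx = np.array([0, 1, half, half + 1])
y = np.array([-1.0, -1.0, 1.0, 1.0])

# Gaussian weight matrix and unnormalized graph Laplacian L = D - W.
sq_dists = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-sq_dists / (2 * 0.5 ** 2))
np.fill_diagonal(W, 0.0)
L = np.diag(W.sum(axis=1)) - W

# Minimize  sum_{i labeled} (f_i - y_i)^2 + lam * f^T L f.
# Setting the gradient to zero gives (S + lam * L) f = S y_full,
# where S is the diagonal indicator of the labeled points.
lam = 0.1
S = np.zeros((n, n))
S[labeled_idx, labeled_idx] = 1.0
b = np.zeros(n)
b[labeled_idx] = y
f = np.linalg.solve(S + lam * L, b)

# With coherent clusters, sign(f) propagates the labels to unlabeled points.
pred = np.sign(f)
```

In this low-dimensional, well-clustered example the method behaves sensibly; the paper's result is that as the number of unlabeled points grows in $d \geq 2$, the minimizer degenerates toward a function that spikes at the labeled points and is uninformative elsewhere.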

Cite

Text

Nadler et al. "Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data." Neural Information Processing Systems, 2009.

Markdown

[Nadler et al. "Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data." Neural Information Processing Systems, 2009.](https://mlanthology.org/neurips/2009/nadler2009neurips-statistical/)

BibTeX

@inproceedings{nadler2009neurips-statistical,
  title     = {{Statistical Analysis of Semi-Supervised Learning: The Limit of Infinite Unlabelled Data}},
  author    = {Nadler, Boaz and Srebro, Nathan and Zhou, Xueyuan},
  booktitle = {Neural Information Processing Systems},
  year      = {2009},
  pages     = {1330--1338},
  url       = {https://mlanthology.org/neurips/2009/nadler2009neurips-statistical/}
}