Linear Concepts and Hidden Variables

Abstract

We study a learning problem which allows for a “fair” comparison between unsupervised learning methods (probabilistic model construction) and more traditional algorithms that directly learn a classification. The merits of each approach are intuitively clear: inducing a model is more expensive computationally, but may support a wider range of predictions. Its performance, however, will depend on how well the postulated probabilistic model fits the data. To compare the paradigms we consider a model which postulates a single binary-valued hidden variable on which all other attributes depend. In this model, finding the most likely value of any one variable (given known values for the others) reduces to testing a linear function of the observed values. We learn the model with two techniques: the standard EM algorithm, and a new algorithm we develop based on covariances. We compare these, in a controlled fashion, against an algorithm (a version of Winnow) that attempts to find a good linear classifier directly. Our conclusions help delimit the fragility of using a model that is even “slightly” simpler than the distribution actually generating the data, versus the relative robustness of directly searching for a good predictor.
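To make the contrast concrete, here is a minimal sketch (ours, not the paper's code) of the model-building paradigm the abstract describes: EM for a model with a single binary hidden variable on which conditionally independent binary attributes depend, plus the linear threshold test that prediction in this model reduces to. The function names, parameter names, and vectorized updates are illustrative assumptions; the paper's covariance-based algorithm is not reproduced here.

```python
import numpy as np

def em_binary_hidden(X, n_iter=50, seed=0, smooth=1e-3):
    """EM for a two-component product distribution over binary attributes:
    a single hidden bit H with prior pi = P(H=1), and each attribute X_i
    conditionally independent given H, with p[h, i] = P(X_i = 1 | H = h)."""
    rng = np.random.default_rng(seed)
    m, n = X.shape
    pi = 0.5
    p = rng.uniform(0.25, 0.75, size=(2, n))  # random start to break symmetry
    for _ in range(n_iter):
        # E-step: posterior responsibility r_j = P(H=1 | x_j) per example.
        ll = np.stack([
            X @ np.log(p[h]) + (1 - X) @ np.log(1 - p[h]) for h in (0, 1)
        ], axis=1)
        ll += np.log([1 - pi, pi])
        ll -= ll.max(axis=1, keepdims=True)  # subtract max for stability
        post = np.exp(ll)
        r = post[:, 1] / post.sum(axis=1)
        # M-step: re-estimate parameters from soft assignments (smoothed).
        pi = r.mean()
        p[1] = (r @ X + smooth) / (r.sum() + 2 * smooth)
        p[0] = ((1 - r) @ X + smooth) / ((1 - r).sum() + 2 * smooth)
    return pi, p

def induced_linear_test(pi, p):
    """Predicting H from observed x reduces to a linear threshold test:
    predict H = 1 iff w @ x + b > 0, with weights given by log-odds ratios."""
    w = np.log(p[1] / p[0]) - np.log((1 - p[1]) / (1 - p[0]))
    b = np.log(pi / (1 - pi)) + np.sum(np.log((1 - p[1]) / (1 - p[0])))
    return w, b
```

For the direct paradigm, a basic Winnow learner over binary attributes might look like the sketch below; the paper uses its own variant of Winnow, whose details may differ from this textbook form.

```python
def winnow(X, y, alpha=2.0):
    """One pass of basic Winnow: a mistake-driven online learner for linear
    threshold functions over binary attributes, with the threshold fixed
    at n and multiplicative weight updates made only on mistakes."""
    m, n = X.shape
    w = np.ones(n)
    for x, label in zip(X, y):
        pred = int(w @ x >= n)
        if pred != label:
            # Promote active weights on a missed positive (label=1, pred=0),
            # demote them on a false positive (label=0, pred=1).
            w *= alpha ** ((label - pred) * x)
    return w
```

The multiplicative update touches only the attributes active in the current example, which is what gives Winnow its robustness when few attributes are relevant; by contrast, the EM route commits to the hidden-variable model, and its predictions degrade if that model is even slightly simpler than the true generating distribution.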

Cite

Text

Grove and Roth. "Linear Concepts and Hidden Variables." Machine Learning, 2001. doi:10.1023/A:1007655119445

Markdown

[Grove and Roth. "Linear Concepts and Hidden Variables." Machine Learning, 2001.](https://mlanthology.org/mlj/2001/grove2001mlj-linear/) doi:10.1023/A:1007655119445

BibTeX

@article{grove2001mlj-linear,
  title     = {{Linear Concepts and Hidden Variables}},
  author    = {Grove, Adam J. and Roth, Dan},
  journal   = {Machine Learning},
  year      = {2001},
  pages     = {123--141},
  doi       = {10.1023/A:1007655119445},
  volume    = {42},
  url       = {https://mlanthology.org/mlj/2001/grove2001mlj-linear/}
}