Linear Classifiers Are Nearly Optimal When Hidden Variables Have Diverse Effects
Abstract
We analyze classification problems in which data is generated by a two-tiered random process: the class is generated first, then a layer of conditionally independent hidden variables, and finally the observed variables. For sources like this, the Bayes-optimal rule for predicting the class given the values of the observed variables is a two-layer neural network. We show that, if the hidden variables have non-negligible effects on many observed variables, a linear classifier approximates the error rate of the Bayes-optimal classifier up to lower-order terms. We also show that the hinge loss of a linear classifier is not much more than the Bayes error rate, which implies that an accurate linear classifier can be found efficiently.
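The two-tiered source and the gap the abstract describes can be sketched numerically. The instantiation below is illustrative and not from the paper: 5 hidden variables that each agree with the class with probability 0.8, and 40 noisy observed copies per hidden variable (agreement probability 0.9). The Bayes-optimal rule marginalizes each hidden variable exactly (a two-layer computation), while the linear classifier is trained by minimizing a regularized hinge loss with Pegasos-style SGD; all parameter choices are assumptions made for the demo.

```python
import numpy as np

rng = np.random.default_rng(0)

K, M, N = 5, 40, 20000   # hidden variables, observed copies per hidden var, samples
P, Q = 0.8, 0.9          # P(hidden agrees with class), P(observed agrees with hidden)

def sample(n):
    """Two-tiered source: class -> conditionally independent hidden vars -> observed vars."""
    y = rng.choice([-1, 1], size=n)
    h = y[:, None] * np.where(rng.random((n, K)) < P, 1, -1)                   # hidden layer
    x = np.repeat(h, M, axis=1) * np.where(rng.random((n, K * M)) < Q, 1, -1)  # observed layer
    return x.astype(float), y

def bayes_predict(X):
    """Bayes-optimal rule: a two-layer network that marginalizes each hidden variable."""
    blocks = X.reshape(len(X), K, M)
    lp_pos = np.where(blocks > 0, np.log(Q), np.log(1 - Q)).sum(axis=2)  # log P(block | h=+1)
    lp_neg = np.where(blocks > 0, np.log(1 - Q), np.log(Q)).sum(axis=2)  # log P(block | h=-1)
    ll_plus = np.logaddexp(np.log(P) + lp_pos, np.log(1 - P) + lp_neg).sum(axis=1)
    ll_minus = np.logaddexp(np.log(1 - P) + lp_pos, np.log(P) + lp_neg).sum(axis=1)
    return np.where(ll_plus >= ll_minus, 1, -1)

X, y = sample(N)

# Linear classifier trained by minimizing regularized hinge loss (Pegasos-style SGD).
lam, w = 1e-4, np.zeros(K * M)
for t in range(1, 100001):
    i = rng.integers(N)
    eta = 1.0 / (lam * t)
    active = y[i] * (X[i] @ w) < 1  # hinge subgradient is nonzero inside the margin
    w = (1 - eta * lam) * w + (eta * y[i] * X[i] if active else 0.0)

Xt, yt = sample(N)
bayes_err = np.mean(bayes_predict(Xt) != yt)
lin_err = np.mean(np.sign(Xt @ w) != yt)
print(f"Bayes error ~ {bayes_err:.3f}, linear classifier error ~ {lin_err:.3f}")
```

Because each hidden variable influences many observed variables (the "diverse effects" condition), the hinge-loss-trained linear rule ends up close to the Bayes error in this toy run, matching the qualitative claim of the theorem.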
Cite

Text

Bshouty and Long. "Linear Classifiers Are Nearly Optimal When Hidden Variables Have Diverse Effects." Machine Learning, 2012. doi:10.1007/s10994-011-5262-7

Markdown

[Bshouty and Long. "Linear Classifiers Are Nearly Optimal When Hidden Variables Have Diverse Effects." Machine Learning, 2012.](https://mlanthology.org/mlj/2012/bshouty2012mlj-linear/) doi:10.1007/s10994-011-5262-7

BibTeX
@article{bshouty2012mlj-linear,
title = {{Linear Classifiers Are Nearly Optimal When Hidden Variables Have Diverse Effects}},
author = {Bshouty, Nader H. and Long, Philip M.},
journal = {Machine Learning},
year = {2012},
pages = {209--231},
doi = {10.1007/s10994-011-5262-7},
volume = {86},
url = {https://mlanthology.org/mlj/2012/bshouty2012mlj-linear/}
}