Confidence-Weighted Linear Classification for Text Categorization

Abstract

Confidence-weighted online learning is a generalization of margin-based learning of linear classifiers in which the margin constraint is replaced by a probabilistic constraint based on a distribution over classifier weights that is updated online as examples are observed. The distribution captures a notion of confidence on classifier weights, and in some cases it can also be interpreted as replacing a single learning rate by adaptive per-weight rates. Confidence-weighted learning was motivated by the statistical properties of natural-language classification tasks, where most of the informative features are relatively rare. We investigate several versions of confidence-weighted learning that use a Gaussian distribution over weight vectors, updated at each observed example to achieve high probability of correct classification for the example. Empirical evaluation on a range of text-categorization tasks shows that our algorithms improve over other state-of-the-art online and batch methods, learn faster in the online setting, and lead to better classifier combination for a type of distributed training commonly used in cloud computing.
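The per-example Gaussian update described in the abstract can be illustrated with a minimal sketch. It assumes one commonly described form of confidence-weighted learning: a diagonal covariance and a variance-based constraint, y (mu . x) >= phi * (x' Sigma x), with phi the standard normal quantile of the target probability eta. The function name `cw_update`, its parameters, and the default `eta` are hypothetical choices made for illustration; the paper studies several variants, and this sketch is not claimed to match any of them exactly.

```python
import math
from statistics import NormalDist


def cw_update(mu, sigma, x, y, eta=0.9):
    """One confidence-weighted step with a diagonal covariance (illustrative).

    mu    : list of mean weights
    sigma : list of per-weight variances (diagonal of the covariance)
    x     : feature vector, y : label in {-1, +1}
    eta   : target probability of correct classification, assumed in (0.5, 1)
    Returns the updated (mu, sigma) as new lists.
    """
    phi = NormalDist().inv_cdf(eta)  # standard normal quantile of eta
    margin = y * sum(w * xi for w, xi in zip(mu, x))       # mean margin M
    variance = sum(s * xi * xi for s, xi in zip(sigma, x))  # margin variance V
    if variance == 0.0:
        return list(mu), list(sigma)  # no active feature carries uncertainty

    # Positive root of the quadratic obtained by making the constraint tight:
    # 2*phi*V^2 * a^2 + V*(1 + 2*phi*M) * a + (M - phi*V) = 0
    b = 1.0 + 2.0 * phi * margin
    disc = b * b - 8.0 * phi * (margin - phi * variance)
    alpha = max(0.0, (-b + math.sqrt(disc)) / (4.0 * phi * variance))

    # Mean moves along y * Sigma * x, so high-variance (rarely seen) weights
    # move more: this is the per-weight learning rate interpretation.
    new_mu = [w + alpha * y * s * xi for w, s, xi in zip(mu, sigma, x)]
    # Precision grows only for features present in x, shrinking their variance.
    new_sigma = [1.0 / (1.0 / s + 2.0 * alpha * phi * xi * xi)
                 for s, xi in zip(sigma, x)]
    return new_mu, new_sigma
```

Under these assumptions, an example that already satisfies the constraint leaves both `mu` and `sigma` unchanged, while a violating example shifts the mean and reduces the variance only for the features that actually occur in `x`.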

Cite

Text

Crammer et al. "Confidence-Weighted Linear Classification for Text Categorization." Journal of Machine Learning Research, 2012.

Markdown

[Crammer et al. "Confidence-Weighted Linear Classification for Text Categorization." Journal of Machine Learning Research, 2012.](https://mlanthology.org/jmlr/2012/crammer2012jmlr-confidenceweighted/)

BibTeX

@article{crammer2012jmlr-confidenceweighted,
  title     = {{Confidence-Weighted Linear Classification for Text Categorization}},
  author    = {Crammer, Koby and Dredze, Mark and Pereira, Fernando},
  journal   = {Journal of Machine Learning Research},
  year      = {2012},
  pages     = {1891--1926},
  volume    = {13},
  url       = {https://mlanthology.org/jmlr/2012/crammer2012jmlr-confidenceweighted/}
}