Doubly Aggressive Selective Sampling Algorithms for Classification

Abstract

Online selective sampling algorithms learn to perform binary classification and, in addition, decide whether to ask, or query, for the label of any given example. We introduce two stochastic linear algorithms and analyze them in the worst-case mistake-bound framework. Although stochastic, for some inputs our algorithms query with probability 1, and they make an update even when there is no mistake but the margin is small; hence they are doubly aggressive. We prove bounds in the worst-case setting, which may be lower than previous bounds in some settings. Experiments with 33 document classification datasets, some with hundreds of thousands of examples, show the superiority of the doubly aggressive algorithms in both performance and number of queries.
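To make the two "aggressive" ingredients from the abstract concrete, here is a minimal, generic sketch of a margin-based selective-sampling perceptron. This is an illustration of the general technique, not the paper's actual algorithms: the querying rule `b / (b + |w·x|)` (which equals 1 when the margin is 0, so such examples are queried deterministically) and the margin threshold `margin_thresh` are assumed choices for the sketch, and the update is a plain perceptron step.

```python
import random

random.seed(0)

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def selective_sampling_perceptron(stream, b=1.0, margin_thresh=1.0):
    """Generic margin-based selective-sampling perceptron (sketch).

    For each example x, the label is queried with probability
    b / (b + |w.x|); when w.x = 0 this probability is 1, so the query
    happens deterministically. A queried example triggers an update
    not only on a mistake but also whenever its margin y*(w.x) falls
    below `margin_thresh` -- the "update even if there is no mistake,
    yet the margin is small" idea described in the abstract.
    """
    w = None
    mistakes = queries = 0
    for x, y in stream:
        if w is None:
            w = [0.0] * len(x)
        p = dot(w, x)                          # prediction margin
        if (1 if p >= 0 else -1) != y:
            mistakes += 1                      # counted even if not queried
        if random.random() < b / (b + abs(p)):
            queries += 1                       # ask for the label y
            if y * p <= margin_thresh:         # mistake OR small margin
                w = [wi + y * xi for wi, xi in zip(w, x)]
    return w, mistakes, queries

# Toy stream: the label is the sign of the first coordinate.
stream = []
for _ in range(2000):
    x = [random.gauss(0, 1) for _ in range(5)]
    stream.append((x, 1 if x[0] >= 0 else -1))

w, mistakes, queries = selective_sampling_perceptron(stream)
acc = sum((1 if dot(w, x) >= 0 else -1) == y for x, y in stream) / len(stream)
```

On this toy stream the learner typically queries only a fraction of the labels while still recovering a reasonable linear separator; the trade-off between accuracy and number of queries is exactly what the paper's experiments measure.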

Cite

Text

Crammer. "Doubly Aggressive Selective Sampling Algorithms for Classification." International Conference on Artificial Intelligence and Statistics, 2014.

Markdown

[Crammer. "Doubly Aggressive Selective Sampling Algorithms for Classification." International Conference on Artificial Intelligence and Statistics, 2014.](https://mlanthology.org/aistats/2014/crammer2014aistats-doubly/)

BibTeX

@inproceedings{crammer2014aistats-doubly,
  title     = {{Doubly Aggressive Selective Sampling Algorithms for Classification}},
  author    = {Crammer, Koby},
  booktitle = {International Conference on Artificial Intelligence and Statistics},
  year      = {2014},
  pages     = {140--148},
  url       = {https://mlanthology.org/aistats/2014/crammer2014aistats-doubly/}
}