Complexity Bounds for Batch Active Learning in Classification

Rolet, Philippe; Teytaud, Olivier

doi:10.1007/978-3-642-15939-8_19

Complexity Bounds for Batch Active Learning in Classification

Philippe Rolet, Olivier Teytaud

ECML-PKDD 2010 pp. 293-305

doi:10.1007/978-3-642-15939-8_19 /ecmlpkdd/2010/rolet2010ecmlpkdd-complexity/

Abstract

Active learning [1] is a branch of Machine Learning in which the learning algorithm, instead of being directly provided with pairs of problem instances and their solutions (their labels), is allowed to choose, from a set of unlabeled data, which instances to query. It is suited to settings where labeling instances is costly. This paper analyzes the speed-up of batch (parallel) active learning compared to sequential active learning (where instances are chosen 1 by 1): how faster can an algorithm become if it can query λ instances at once? There are two main contributions: proving lower and upper bounds on the possible gain, and illustrating them by experimenting on usual active learning algorithms. Roughly speaking, the speed-up is asymptotically logarithmic in the batch size ł (i.e. when ł→ ∞). However, for some classes of functions with finite VC-dimension V , a linear speed-up can be achieved until a batch size of V . Practically speaking, this means that parallelizing computations on an expensive-to-label problem which is suited to active learning is very beneficial until V simultaneous queries, and less interesting (yet still bringing improvement) afterwards.

PDF ECML-PKDD Semantic Scholar

Cite

Text

Rolet and Teytaud. "Complexity Bounds for Batch Active Learning in Classification." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010. doi:10.1007/978-3-642-15939-8_19

Markdown

[Rolet and Teytaud. "Complexity Bounds for Batch Active Learning in Classification." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010.](https://mlanthology.org/ecmlpkdd/2010/rolet2010ecmlpkdd-complexity/) doi:10.1007/978-3-642-15939-8_19

BibTeX

@inproceedings{rolet2010ecmlpkdd-complexity,
  title     = {{Complexity Bounds for Batch Active Learning in Classification}},
  author    = {Rolet, Philippe and Teytaud, Olivier},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2010},
  pages     = {293-305},
  doi       = {10.1007/978-3-642-15939-8_19},
  url       = {https://mlanthology.org/ecmlpkdd/2010/rolet2010ecmlpkdd-complexity/}
}