Flattening a Hierarchical Clustering Through Active Learning

Abstract

We investigate active learning by pairwise similarity over the leaves of trees originating from hierarchical clustering procedures. In the realizable setting, we provide a full characterization of the number of queries needed to achieve perfect reconstruction of the tree cut. In the non-realizable setting, we rely on known importance-sampling procedures to obtain regret and query complexity bounds. Our algorithms come with theoretical guarantees on the statistical error and, more importantly, lend themselves to *linear-time* implementations in the relevant parameters of the problem. We discuss such implementations, prove running time guarantees for them, and present preliminary experiments on real-world datasets showing the compelling practical performance of our algorithms as compared to both passive learning and simple active learning baselines.
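To make the realizable setting concrete, here is a minimal, hedged sketch (not the paper's exact algorithm) of flattening a binary cluster tree with a same-cluster oracle. When the ground-truth flat clustering is a cut of the tree, a single cross-split query at each internal node decides whether that subtree is one pure cluster, so the number of queries is linear in the number of nodes visited. The names `flatten`, `oracle`, `first_leaf`, and `leaves` are illustrative assumptions.

```python
def flatten(node, oracle):
    """Return the flat clustering (list of leaf lists) under `node`.

    `node` is either a leaf (any hashable item) or a pair
    (left_subtree, right_subtree). `oracle(a, b)` returns True iff
    leaves a and b belong to the same ground-truth cluster.
    """
    if not isinstance(node, tuple):  # leaf: singleton cluster
        return [[node]]
    left, right = node
    # Realizable case: the true clustering is a cut of the tree, so one
    # cross-split pair answers for the whole subtree -- either every
    # cross pair is same-cluster, or none is.
    if oracle(first_leaf(left), first_leaf(right)):
        return [leaves(node)]  # whole subtree forms one cluster
    return flatten(left, oracle) + flatten(right, oracle)


def first_leaf(node):
    """Leftmost leaf of a subtree."""
    return first_leaf(node[0]) if isinstance(node, tuple) else node


def leaves(node):
    """All leaves of a subtree, left to right."""
    if not isinstance(node, tuple):
        return [node]
    return leaves(node[0]) + leaves(node[1])


if __name__ == "__main__":
    # Toy tree over leaves a, b, c, d; true clusters {a, b} and {c, d}.
    tree = (("a", "b"), ("c", "d"))
    oracle = lambda x, y: (x in "ab") == (y in "ab")
    print(flatten(tree, oracle))  # → [['a', 'b'], ['c', 'd']]
```

The key design point is that purity of a subtree can be certified with one query, which is what keeps the query count, and the running time, linear rather than quadratic in the number of leaves.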

Cite

Text

Vitale et al. "Flattening a Hierarchical Clustering Through Active Learning." Neural Information Processing Systems, 2019.

Markdown

[Vitale et al. "Flattening a Hierarchical Clustering Through Active Learning." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/vitale2019neurips-flattening/)

BibTeX

@inproceedings{vitale2019neurips-flattening,
  title     = {{Flattening a Hierarchical Clustering Through Active Learning}},
  author    = {Vitale, Fabio and Rajagopalan, Anand and Gentile, Claudio},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {15289--15299},
  url       = {https://mlanthology.org/neurips/2019/vitale2019neurips-flattening/}
}