Medoids in Almost-Linear Time via Multi-Armed Bandits
Abstract
Computing the medoid of a large number of points in high-dimensional space is an increasingly common operation in many data science problems. We present an algorithm Med-dit to compute the medoid with high probability, which uses $O(n\log n)$ distance evaluations. Med-dit is based on a connection with the Multi-Armed Bandit problem. We evaluate the performance of Med-dit empirically on the Netflix-prize and single-cell RNA-seq datasets, containing hundreds of thousands of points living in tens of thousands of dimensions, and observe a $5$-$10$x improvement in performance over the current state of the art. We have released the code of Med-dit and our empirical results at https://github.com/bagavi/Meddit.
Cite
Text
Bagaria et al. "Medoids in Almost-Linear Time via Multi-Armed Bandits." International Conference on Artificial Intelligence and Statistics, 2018.Markdown
[Bagaria et al. "Medoids in Almost-Linear Time via Multi-Armed Bandits." International Conference on Artificial Intelligence and Statistics, 2018.](https://mlanthology.org/aistats/2018/bagaria2018aistats-medoids/)BibTeX
@inproceedings{bagaria2018aistats-medoids,
title = {{Medoids in Almost-Linear Time via Multi-Armed Bandits}},
author = {Bagaria, Vivek Kumar and Kamath, Govinda M. and Ntranos, Vasilis and Zhang, Martin J. and Tse, David},
booktitle = {International Conference on Artificial Intelligence and Statistics},
year = {2018},
pages = {500-509},
url = {https://mlanthology.org/aistats/2018/bagaria2018aistats-medoids/}
}