Almost Linear Time Density Level Set Estimation via DBSCAN

Abstract

We study fast algorithms for $\lambda$-density level set estimation via DBSCAN clustering. Previous work (Jiang ICML’17, and Jang and Jiang ICML’19) shows that, under some natural assumptions, DBSCAN and its variant DBSCAN++ can estimate the $\lambda$-density level set with near-optimal Hausdorff distance, i.e., at rate $\tilde{O}(n^{-1/(2\beta+D)})$. However, to achieve this near-optimal rate, the fastest existing DBSCAN algorithm needs near-quadratic running time, which is impractical for very large datasets, where linear or almost linear time algorithms are desired. Motivated by this, we present a modified DBSCAN algorithm that achieves near-optimal Hausdorff distance for density level set estimation in $\tilde{O}(n)$ running time. In our empirical study, we show that our algorithm provides a significant speedup over previous algorithms while achieving comparable solution quality.
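
To make the quadratic baseline concrete, here is a minimal sketch of the standard DBSCAN core-point rule, whose output can be read as an estimate of a high-density region. It is not the paper's near-linear algorithm. The function name `core_points`, the synthetic data, and the parameter values are assumptions chosen for illustration, and the brute-force pairwise distances are what make this version quadratic in the number of points.

```python
import numpy as np


def core_points(X, eps, min_pts):
    """Return the rows of X with at least min_pts points (self included) within distance eps.

    Hypothetical helper for illustration: brute force, O(n^2) time and memory.
    """
    # All pairwise difference vectors, shape (n, n, d).
    diff = X[:, None, :] - X[None, :, :]
    # Squared Euclidean distances, shape (n, n).
    sq_dist = np.einsum("ijk,ijk->ij", diff, diff)
    # Number of points inside each eps-ball, counting the center itself.
    counts = (sq_dist <= eps ** 2).sum(axis=1)
    return X[counts >= min_pts]


# Assumed toy data: a tight cluster plus scattered background points.
rng = np.random.default_rng(0)
dense = rng.normal(loc=0.0, scale=0.1, size=(200, 2))
sparse = rng.uniform(low=-5.0, high=5.0, size=(20, 2))
X = np.vstack([dense, sparse])

# Points kept here form the estimated high-density region.
level_set = core_points(X, eps=0.2, min_pts=20)
```

Larger `min_pts` or smaller `eps` corresponds to a higher density threshold, so fewer points are retained. The n-by-n distance computation is the step that becomes the bottleneck on large inputs.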

Cite

Text

Esfandiari et al. "Almost Linear Time Density Level Set Estimation via DBSCAN." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I8.16902

Markdown

[Esfandiari et al. "Almost Linear Time Density Level Set Estimation via DBSCAN." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/esfandiari2021aaai-almost/) doi:10.1609/AAAI.V35I8.16902

BibTeX

@inproceedings{esfandiari2021aaai-almost,
  title     = {{Almost Linear Time Density Level Set Estimation via DBSCAN}},
  author    = {Esfandiari, Hossein and Mirrokni, Vahab S. and Zhong, Peilin},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {7349--7357},
  doi       = {10.1609/AAAI.V35I8.16902},
  url       = {https://mlanthology.org/aaai/2021/esfandiari2021aaai-almost/}
}