Almost Linear Time Density Level Set Estimation via DBSCAN
Abstract
In this work we focus on designing a fast algorithm for $\lambda$-density level set estimation via DBSCAN clustering. Previous work (Jiang, ICML’17; Jang and Jiang, ICML’19) shows that, under some natural assumptions, DBSCAN and its variant DBSCAN++ can estimate the $\lambda$-density level set with near-optimal Hausdorff distance, i.e., at rate $\tilde{O}(n^{-1/(2\beta+D)})$. However, to achieve this near-optimal rate, the fastest existing DBSCAN algorithm needs near-quadratic running time, which is impractical for very large datasets, where linear or almost linear time algorithms are desired. Motivated by this, we present a modified DBSCAN algorithm that achieves near-optimal Hausdorff distance for density level set estimation in $\tilde{O}(n)$ running time. In our empirical study, we show that our algorithm provides a significant speedup over the previous algorithms while achieving comparable solution quality.
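To make the setting concrete, the following is a minimal sketch of the quadratic-time baseline the abstract contrasts against, not the almost linear time algorithm of the paper. It assumes the usual DBSCAN notion of a core point (at least `min_pts` sample points within distance `eps`) and treats the set of core points as the level set estimate; the function names, parameter names, and the brute-force neighbor search are illustrative assumptions. A helper for the Hausdorff distance between two finite point sets is included because that is the error measure the abstract refers to.

```python
import math
import random


def core_points(points, eps, min_pts):
    """Return the points with at least min_pts sample points within eps.

    The point itself is counted as its own neighbor. This brute-force
    search performs O(n^2) distance computations, which is the cost the
    abstract describes as impractical for very large datasets.
    """
    cores = []
    for p in points:
        neighbors = sum(1 for q in points if math.dist(p, q) <= eps)
        if neighbors >= min_pts:
            cores.append(p)
    return cores


def hausdorff(a, b):
    """Hausdorff distance between two non-empty finite point sets."""
    def directed(xs, ys):
        # Largest distance from a point of xs to its nearest point in ys.
        return max(min(math.dist(x, y) for y in ys) for x in xs)

    return max(directed(a, b), directed(b, a))


if __name__ == "__main__":
    # Hypothetical data: a dense Gaussian blob plus sparse uniform noise.
    rng = random.Random(0)
    dense = [(rng.gauss(0.0, 0.1), rng.gauss(0.0, 0.1)) for _ in range(200)]
    sparse = [(rng.uniform(-3.0, 3.0), rng.uniform(-3.0, 3.0)) for _ in range(20)]
    sample = dense + sparse
    estimate = core_points(sample, eps=0.1, min_pts=10)
    print(len(estimate), "of", len(sample), "points kept as the estimate")
```

In this sketch `eps` and `min_pts` together play the role of the density threshold: raising `min_pts` or shrinking `eps` corresponds to estimating a higher level set.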
Cite
Text
Esfandiari et al. "Almost Linear Time Density Level Set Estimation via DBSCAN." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I8.16902
Markdown
[Esfandiari et al. "Almost Linear Time Density Level Set Estimation via DBSCAN." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/esfandiari2021aaai-almost/) doi:10.1609/AAAI.V35I8.16902
BibTeX
@inproceedings{esfandiari2021aaai-almost,
title = {{Almost Linear Time Density Level Set Estimation via DBSCAN}},
author = {Esfandiari, Hossein and Mirrokni, Vahab S. and Zhong, Peilin},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2021},
pages = {7349-7357},
doi = {10.1609/AAAI.V35I8.16902},
url = {https://mlanthology.org/aaai/2021/esfandiari2021aaai-almost/}
}