L1-Depth Revisited: A Robust Angle-Based Outlier Factor in High-Dimensional Space
Abstract
Angle-based outlier detection (ABOD) has recently emerged as an effective method for detecting outliers in high dimensions. Instead of examining neighborhoods as proximity-based concepts do, ABOD uses the broadness of a point's angle spectrum as its outlier factor. Despite being a parameter-free and robust measure in high-dimensional space, the exact solution of ABOD has cubic cost $O(n^3)$ in the data size $n$, and hence cannot be used on large-scale data sets. In this work we present a conceptual relationship between the ABOD intuition and the L1-depth concept in statistics, one of the earliest methods used for detecting outliers. Building on this relationship, we propose to use L1-depth as a variant of angle-based outlier factors, since it requires only quadratic computational time, like proximity-based outlier factors. Empirically, L1-depth is competitive with (and often superior to) proximity-based and other proposed angle-based outlier factors for detecting high-dimensional outliers, in both efficiency and accuracy. To avoid the quadratic computational time, we introduce a simple but efficient sampling method named SamDepth for estimating the L1-depth measure. We also present a theoretical analysis that guarantees the reliability of SamDepth. Empirical experiments on many real-world high-dimensional data sets demonstrate that SamDepth with $\sqrt{n}$ samples often achieves very competitive accuracy and runs several orders of magnitude faster than other proximity-based and ABOD competitors. Data related to this paper are available at: https://www.dropbox.com/s/nk7nqmwmdsatizs/Datasets.zip. Code related to this paper is available at: https://github.com/NinhPham/Outlier.
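The abstract does not spell out the L1-depth formula, so the following is only a minimal sketch. It assumes the usual spatial-depth form, one minus the norm of the average unit vector pointing from the query point to the data points, and it imitates the sampling idea by evaluating that quantity on roughly $\sqrt{n}$ uniformly sampled points. The function names `l1_depth` and `sampled_l1_depth` are hypothetical and are not taken from the paper or its repository.

```python
import numpy as np


def l1_depth(q, X, eps=1e-12):
    """Assumed L1-depth of q w.r.t. the rows of X.

    Computed as 1 - || mean_i (x_i - q) / ||x_i - q|| ||.
    Low values suggest q lies outside the data cloud (outlier-like).
    """
    diffs = X - q
    norms = np.linalg.norm(diffs, axis=1)
    keep = norms > eps  # ignore points that coincide with q
    if not np.any(keep):
        return 1.0
    units = diffs[keep] / norms[keep, None]
    return 1.0 - np.linalg.norm(units.mean(axis=0))


def sampled_l1_depth(q, X, rng, m=None):
    """Estimate the depth of q from m uniformly sampled rows of X.

    Defaults to m = ceil(sqrt(n)), mirroring the sample size the
    abstract mentions; the sampling scheme itself is an assumption.
    """
    n = X.shape[0]
    if m is None:
        m = max(1, int(np.ceil(np.sqrt(n))))
    idx = rng.choice(n, size=min(m, n), replace=False)
    return l1_depth(q, X[idx])


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(2000, 50))   # synthetic inliers
    inlier = np.zeros(50)             # near the center of the cloud
    outlier = np.full(50, 10.0)       # far from the cloud
    print("exact  :", l1_depth(inlier, X), l1_depth(outlier, X))
    print("sampled:", sampled_l1_depth(inlier, X, rng),
          sampled_l1_depth(outlier, X, rng))
```

Scoring one point exactly costs $O(nd)$, so scoring all $n$ points is quadratic in $n$. The sampled version costs $O(\sqrt{n}\,d)$ per point under these assumptions.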
Cite
Text
Pham. "L1-Depth Revisited: A Robust Angle-Based Outlier Factor in High-Dimensional Space." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2018. doi:10.1007/978-3-030-10925-7_7
Markdown
[Pham. "L1-Depth Revisited: A Robust Angle-Based Outlier Factor in High-Dimensional Space." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2018.](https://mlanthology.org/ecmlpkdd/2018/pham2018ecmlpkdd-l1depth/) doi:10.1007/978-3-030-10925-7_7
BibTeX
@inproceedings{pham2018ecmlpkdd-l1depth,
title = {{L1-Depth Revisited: A Robust Angle-Based Outlier Factor in High-Dimensional Space}},
author = {Pham, Ninh},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2018},
pages = {105-121},
doi = {10.1007/978-3-030-10925-7_7},
url = {https://mlanthology.org/ecmlpkdd/2018/pham2018ecmlpkdd-l1depth/}
}