Concentration Free Outlier Detection
Abstract
We present a novel notion of outlier, called Concentration Free Outlier Factor (CFOF), which resists the concentration phenomena that affect other outlier scores as the dimensionality of the feature space increases. Indeed, we formally prove that CFOF does not concentrate in intrinsically high-dimensional spaces. Moreover, CFOF adapts to different local density levels and can be computed reliably without computing exact neighbors. We present a very efficient technique, named fast-CFOF, for detecting outliers in very large high-dimensional datasets. The technique is efficiently parallelizable, and we provide a MIMD-SIMD implementation. Experimental results confirm the scalability and effectiveness of the technique and show that CFOF achieves state-of-the-art detection performance.
Cite
Text
Angiulli. "Concentration Free Outlier Detection." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2017. doi:10.1007/978-3-319-71249-9_1
Markdown
[Angiulli. "Concentration Free Outlier Detection." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2017.](https://mlanthology.org/ecmlpkdd/2017/angiulli2017ecmlpkdd-concentration/) doi:10.1007/978-3-319-71249-9_1
BibTeX
@inproceedings{angiulli2017ecmlpkdd-concentration,
title = {{Concentration Free Outlier Detection}},
author = {Angiulli, Fabrizio},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2017},
pages = {3--19},
doi = {10.1007/978-3-319-71249-9_1},
url = {https://mlanthology.org/ecmlpkdd/2017/angiulli2017ecmlpkdd-concentration/}
}