Automatic Unsupervised Outlier Model Selection

Abstract

Given an unsupervised outlier detection task on a new dataset, how can we automatically select a good outlier detection algorithm and its hyperparameter(s) (collectively called a model)? In this work, we tackle the unsupervised outlier model selection (UOMS) problem and propose MetaOD, a principled, data-driven approach to UOMS based on meta-learning. The UOMS problem is notoriously challenging compared to model selection for classification and clustering, since (i) model evaluation is infeasible due to the lack of hold-out data with labels, and (ii) model comparison is infeasible due to the lack of a universal objective function. MetaOD capitalizes on the performances of a large body of detection models on historical outlier detection benchmark datasets, and carries over this prior experience to automatically select an effective model to be employed on a new dataset without any labels, model evaluations, or model comparisons. To capture task similarity within our meta-learning framework, we introduce specialized meta-features that quantify outlying characteristics of a dataset. Extensive experiments show that selecting a model by MetaOD significantly outperforms no model selection (e.g., always using the same popular model or the ensemble of many) as well as other meta-learning techniques that we tailored for UOMS. Moreover, upon (meta-)training, MetaOD is extremely efficient at test time; selecting from a large pool of 300+ models takes less than 1 second for a new task. We open-source MetaOD and our meta-learning database for practical use and to foster further research on the UOMS problem.
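
The general idea of carrying over prior experience can be illustrated with a small sketch. It is not MetaOD's actual algorithm, which the abstract does not specify; it assumes a much simpler nearest-neighbor rule: find the historical datasets whose meta-features are closest to those of the new dataset, then pick the model with the best average recorded performance on them. The function name `select_model`, the two toy meta-features, and all numbers are hypothetical.

```python
import numpy as np


def select_model(meta_features_train, performance, meta_features_new, k=2):
    """Return the index of the model with the best mean performance on the
    k historical datasets most similar to the new one.

    Simplified nearest-neighbor stand-in for meta-learning-based selection;
    NOT the MetaOD method itself.
    """
    # Standardize meta-features with statistics of the historical datasets.
    mu = meta_features_train.mean(axis=0)
    sigma = meta_features_train.std(axis=0)
    sigma[sigma == 0] = 1.0  # avoid division by zero for constant features
    z_train = (meta_features_train - mu) / sigma
    z_new = (meta_features_new - mu) / sigma

    # Euclidean distance from the new dataset to each historical dataset.
    dists = np.linalg.norm(z_train - z_new, axis=1)
    nearest = np.argsort(dists)[:k]

    # Average the recorded performance of every model over those neighbors.
    avg_perf = performance[nearest].mean(axis=0)
    return int(np.argmax(avg_perf))


# Hypothetical meta-features: one row per historical dataset.
meta_features_train = np.array([
    [0.10, 1.0],
    [0.12, 1.1],
    [0.90, 5.0],
    [0.88, 5.2],
])

# Hypothetical performance matrix: rows = datasets, columns = candidate models.
performance = np.array([
    [0.90, 0.60, 0.70],
    [0.88, 0.62, 0.71],
    [0.55, 0.92, 0.70],
    [0.57, 0.90, 0.72],
])

# Meta-features of the new, unlabeled dataset.
meta_features_new = np.array([0.85, 4.9])

best = select_model(meta_features_train, performance, meta_features_new, k=2)
print("selected model index:", best)
```

In this toy setup the new dataset resembles the last two historical datasets, so the model that performed best on those is chosen. No labels and no model evaluations on the new dataset are needed at selection time, which is the property the abstract emphasizes.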

Cite

Text

Zhao et al. "Automatic Unsupervised Outlier Model Selection." Neural Information Processing Systems, 2021.

Markdown

[Zhao et al. "Automatic Unsupervised Outlier Model Selection." Neural Information Processing Systems, 2021.](https://mlanthology.org/neurips/2021/zhao2021neurips-automatic/)

BibTeX

@inproceedings{zhao2021neurips-automatic,
  title     = {{Automatic Unsupervised Outlier Model Selection}},
  author    = {Zhao, Yue and Rossi, Ryan and Akoglu, Leman},
  booktitle = {Neural Information Processing Systems},
  year      = {2021},
  url       = {https://mlanthology.org/neurips/2021/zhao2021neurips-automatic/}
}