Estimating the Support of a High-Dimensional Distribution

Abstract

Suppose you are given some data set drawn from an underlying probability distribution P and you want to estimate a “simple” subset S of input space such that the probability that a test point drawn from P lies outside of S equals some a priori specified value between 0 and 1. We propose a method to approach this problem by trying to estimate a function f that is positive on S and negative on the complement. The functional form of f is given by a kernel expansion in terms of a potentially small subset of the training data; it is regularized by controlling the length of the weight vector in an associated feature space. The expansion coefficients are found by solving a quadratic programming problem, which we do by carrying out sequential optimization over pairs of input patterns. We also provide a theoretical analysis of the statistical performance of our algorithm. The algorithm is a natural extension of the support vector algorithm to the case of unlabeled data.
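The estimator the abstract describes is what is now commonly called a one-class SVM: the region S is carved out by thresholding a kernel expansion whose coefficients come from a quadratic program, with a parameter nu controlling the trade-off between the size of S and the fraction of training points it excludes. As a rough illustration (not part of the paper), the sketch below uses scikit-learn's OneClassSVM, which follows this nu-parameterized formulation; the data, gamma, and nu values are illustrative assumptions.

# Minimal sketch of support estimation with a one-class SVM.
# All numeric values here are assumptions chosen for illustration.
import numpy as np
from sklearn.svm import OneClassSVM

rng = np.random.default_rng(0)
X_train = rng.normal(loc=0.0, scale=1.0, size=(200, 2))  # samples drawn from P
X_test = np.array([[0.1, -0.2], [4.0, 4.0]])             # a likely inlier and a likely outlier

# nu upper-bounds the fraction of training points that fall outside the
# estimated region S and lower-bounds the fraction of support vectors.
clf = OneClassSVM(kernel="rbf", gamma=0.5, nu=0.1)
clf.fit(X_train)

print(clf.predict(X_test))            # +1 for points inside S, -1 outside
print(clf.decision_function(X_test))  # signed value of the learned function f

The sign of the decision function plays the role of f in the abstract: positive inside the estimated support, negative on its complement.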

Cite

Text

Schölkopf et al. "Estimating the Support of a High-Dimensional Distribution." Neural Computation, 2001. doi:10.1162/089976601750264965

Markdown

[Schölkopf et al. "Estimating the Support of a High-Dimensional Distribution." Neural Computation, 2001.](https://mlanthology.org/neco/2001/scholkopf2001neco-estimating/) doi:10.1162/089976601750264965

BibTeX

@article{scholkopf2001neco-estimating,
  title     = {{Estimating the Support of a High-Dimensional Distribution}},
  author    = {Schölkopf, Bernhard and Platt, John C. and Shawe-Taylor, John and Smola, Alexander J. and Williamson, Robert C.},
  journal   = {Neural Computation},
  year      = {2001},
  pages     = {1443--1471},
  doi       = {10.1162/089976601750264965},
  volume    = {13},
  url       = {https://mlanthology.org/neco/2001/scholkopf2001neco-estimating/}
}