The Computational Complexity of Densest Region Detection

Abstract

We investigate the computational complexity of the task of detecting dense regions of an unknown distribution from unlabeled samples of this distribution. We introduce a formal learning model for this task that uses a hypothesis class as it “anti-overfitting” mechanism. The learning task in our model can be reduced to a combinatorial optimization problem. We can show that for some constants, depending on the hypothesis class, these problems are NP-hard to approximate to within these constant factors. We go on and introduce a new criterion for the success of approximate optimization geometric problems. The new criterion requires that the algorithm competes with hypotheses only on the points that are separated by some margin ? from their boundaries. Quite surprisingly, we discover that for each of the two hypothesis classes that we investigate, there is a “critical value” of the margin parameter ?. For any value below the critical value the problems are NP-hard to approximate, while, once this value is exceeded, the problems become poly-time solvable.

Cite

Text

Ben-David et al. "The Computational Complexity of Densest Region Detection." Annual Conference on Computational Learning Theory, 2000. doi:10.1006/jcss.2001.1797

Markdown

[Ben-David et al. "The Computational Complexity of Densest Region Detection." Annual Conference on Computational Learning Theory, 2000.](https://mlanthology.org/colt/2000/bendavid2000colt-computational/) doi:10.1006/jcss.2001.1797

BibTeX

@inproceedings{bendavid2000colt-computational,
  title     = {{The Computational Complexity of Densest Region Detection}},
  author    = {Ben-David, Shai and Eiron, Nadav and Simon, Hans Ulrich},
  booktitle = {Annual Conference on Computational Learning Theory},
  year      = {2000},
  pages     = {255-265},
  doi       = {10.1006/jcss.2001.1797},
  url       = {https://mlanthology.org/colt/2000/bendavid2000colt-computational/}
}