Localized Centering: Reducing Hubness in Large-Sample Data

Abstract

Hubness has recently been identified as a problematic phenomenon occurring in high-dimensional spaces. In this paper, we address a different type of hubness, one that occurs when the number of samples is large. We investigate how hubness in high-dimensional data differs from hubness in large-sample data. One finding is that centering, which is known to reduce the former, does not work for the latter. We then propose a new hub-reduction method, called localized centering. It is an extension of centering, yet it works effectively for both types of hubness. Using real-world datasets consisting of a large number of documents, we demonstrate that the proposed method improves the accuracy of k-nearest neighbor classification.
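The abstract names its techniques without giving formulas, so the following is only a rough sketch of how hubness, centering, and a localized variant of centering might be computed. It assumes inner-product similarity on row vectors, measures hubness as the skewness of the k-occurrence counts (a common choice in the hubness literature), and assumes that "localized" means subtracting each object's similarity to the centroid of its own kappa nearest neighbors. All function names and the parameter `kappa` are hypothetical and not taken from the source.

```python
import numpy as np

def k_occurrence(sim, k):
    """N_k: how often each point appears in the k-NN lists of the other points.

    Rows of `sim` are treated as queries, columns as candidate neighbors.
    """
    s = sim.astype(float)                 # astype returns a copy
    np.fill_diagonal(s, -np.inf)          # a point is not its own neighbor
    knn = np.argsort(-s, axis=1)[:, :k]   # k most similar columns per row
    return np.bincount(knn.ravel(), minlength=sim.shape[0])

def skewness(counts):
    """Skewness of the N_k distribution; large positive values suggest hubs."""
    c = counts - counts.mean()
    std = c.std()
    return 0.0 if std == 0 else float((c ** 3).mean() / std ** 3)

def centered_similarity(X):
    """Global centering: inner products after subtracting the data centroid."""
    Xc = X - X.mean(axis=0)
    return Xc @ Xc.T

def locally_centered_similarity(X, kappa):
    """Assumed localized variant (not confirmed by the abstract):
    sim(q, x) minus the similarity of x to the centroid of x's
    kappa nearest neighbors.
    """
    sim = X @ X.T
    s = sim.copy()
    np.fill_diagonal(s, -np.inf)
    knn = np.argsort(-s, axis=1)[:, :kappa]        # neighbors of each object
    local_centroids = X[knn].mean(axis=1)          # shape (n, d)
    local_affinity = np.einsum("ij,ij->i", X, local_centroids)
    # Penalize each candidate column x by its own local affinity.
    return sim - local_affinity[None, :]
```

Comparing `skewness(k_occurrence(S, k))` across the raw, globally centered, and locally centered similarity matrices gives a quick indication of whether a transformation reduces hubness on a given dataset. Note that the locally centered matrix is generally not symmetric, because the correction depends only on the candidate object and not on the query.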

Cite

Text

Hara et al. "Localized Centering: Reducing Hubness in Large-Sample Data." AAAI Conference on Artificial Intelligence, 2015. doi:10.1609/AAAI.V29I1.9629

Markdown

[Hara et al. "Localized Centering: Reducing Hubness in Large-Sample Data." AAAI Conference on Artificial Intelligence, 2015.](https://mlanthology.org/aaai/2015/hara2015aaai-localized/) doi:10.1609/AAAI.V29I1.9629

BibTeX

@inproceedings{hara2015aaai-localized,
  title     = {{Localized Centering: Reducing Hubness in Large-Sample Data}},
  author    = {Hara, Kazuo and Suzuki, Ikumi and Shimbo, Masashi and Kobayashi, Kei and Fukumizu, Kenji and Radovanovic, Milos},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2015},
  pages     = {2645--2651},
  doi       = {10.1609/AAAI.V29I1.9629},
  url       = {https://mlanthology.org/aaai/2015/hara2015aaai-localized/}
}