Information Preserving Dimensionality Reduction

Abstract

Dimensionality reduction is a common preprocessing step in many machine learning tasks. The goal is to design data representations that, on the one hand, reduce the dimension of the data (and therefore allow faster processing) and, on the other hand, retain as much task-relevant information as possible. We consider generic dimensionality reduction approaches that do not rely on much task-specific prior knowledge, focusing on scenarios in which unlabeled samples are available and can be used to evaluate the usefulness of candidate data representations. We aim to provide theoretical principles that help explain the success of certain dimensionality reduction techniques in classification tasks, and that guide the choice of dimensionality reduction tool and parameters. Our analysis is based on formalizing the often implicit assumption that “similar instances are likely to have similar labels”. The theoretical analysis is supported by experimental results.
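As a minimal illustration of the setting the abstract describes (not the authors' method), the sketch below uses PCA, a standard generic dimensionality reduction technique, on unlabeled data: it maps samples to a lower dimension and reports how much of the data's variance the reduced representation retains. All names and parameters here are illustrative assumptions.

```python
import numpy as np

# Illustrative sketch only: PCA via SVD as one generic, label-free
# dimensionality reduction technique. Synthetic data with correlated
# features stands in for the unlabeled samples.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10)) @ rng.normal(size=(10, 10))

Xc = X - X.mean(axis=0)                      # center the data
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 3                                        # target (reduced) dimension
Z = Xc @ Vt[:k].T                            # reduced representation, shape (200, 3)
retained = (S[:k] ** 2).sum() / (S ** 2).sum()  # fraction of variance retained
```

The quantity `retained` is one crude, task-agnostic proxy for "information preserved"; the paper's point is to analyze such choices with respect to downstream classification under the similarity assumption.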

Cite

Text

Kushagra and Ben-David. "Information Preserving Dimensionality Reduction." International Conference on Algorithmic Learning Theory, 2015. doi:10.1007/978-3-319-24486-0_16

Markdown

[Kushagra and Ben-David. "Information Preserving Dimensionality Reduction." International Conference on Algorithmic Learning Theory, 2015.](https://mlanthology.org/alt/2015/kushagra2015alt-information/) doi:10.1007/978-3-319-24486-0_16

BibTeX

@inproceedings{kushagra2015alt-information,
  title     = {{Information Preserving Dimensionality Reduction}},
  author    = {Kushagra, Shrinu and Ben-David, Shai},
  booktitle = {International Conference on Algorithmic Learning Theory},
  year      = {2015},
  pages     = {239--253},
  doi       = {10.1007/978-3-319-24486-0_16},
  url       = {https://mlanthology.org/alt/2015/kushagra2015alt-information/}
}