Parametric T-Distributed Stochastic Exemplar-Centered Embedding

Abstract

Parametric embedding methods such as parametric t-SNE (pt-SNE) have been widely adopted for data visualization and out-of-sample data embedding without further computationally expensive optimization or approximation. However, pt-SNE is highly sensitive to the batch-size hyperparameter because of conflicting optimization goals, and it often produces dramatically different embeddings for different user-defined perplexities. To address these issues, we present parametric t-distributed stochastic exemplar-centered embedding methods. Our strategy learns embedding parameters by comparing the given data only with precomputed exemplars, resulting in a cost function with linear computational and memory complexity, which is further reduced by noise contrastive samples. Moreover, we propose a shallow embedding network with high-order feature interactions for data visualization, which is much easier to tune than the deep neural network employed by pt-SNE yet achieves comparable performance. We empirically demonstrate on several benchmark datasets that our proposed methods significantly outperform pt-SNE in terms of robustness, visual effects, and quantitative evaluations.
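The exemplar-centered idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `exemplar_kl_loss`, the fixed Gaussian bandwidth `sigma`, the per-point normalization, the random choice of exemplars, and the linear map `W` standing in for the embedding network are all assumptions made for illustration. The sketch shows only why the cost scales linearly in the number of data points: each of the n points is compared with m exemplars, so the affinity matrices have shape (n, m) rather than (n, n). The noise contrastive sampling mentioned in the abstract is not covered.

```python
import numpy as np


def exemplar_kl_loss(X, exemplars, Y, Y_exemplars, sigma=1.0):
    """Mean KL(P || Q) between data-to-exemplar affinities (illustrative only).

    X, exemplars     : input-space points (n, d) and exemplars (m, d)
    Y, Y_exemplars   : their low-dimensional embeddings (n, k) and (m, k)
    sigma            : assumed fixed Gaussian bandwidth (hypothetical choice)
    """
    # Squared distances from each point to each exemplar: shape (n, m).
    d_hi = ((X[:, None, :] - exemplars[None, :, :]) ** 2).sum(axis=-1)
    d_lo = ((Y[:, None, :] - Y_exemplars[None, :, :]) ** 2).sum(axis=-1)

    # Input-space affinities: Gaussian kernel, normalized per point.
    logits = -d_hi / (2.0 * sigma ** 2)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    P /= P.sum(axis=1, keepdims=True)

    # Embedding-space affinities: Student-t kernel with one degree of freedom.
    Q = 1.0 / (1.0 + d_lo)
    Q /= Q.sum(axis=1, keepdims=True)

    eps = 1e-12  # guards the logarithms against zeros
    kl = np.sum(P * (np.log(P + eps) - np.log(Q + eps)))
    return float(kl / X.shape[0])


# Toy usage: random data, 8 randomly chosen exemplars, a random linear embedding.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))
exemplars = X[rng.choice(200, size=8, replace=False)]
W = rng.normal(size=(10, 2)) * 0.1  # hypothetical stand-in for the embedding network
loss = exemplar_kl_loss(X, exemplars, X @ W, exemplars @ W)
```

In a trainable version, `W` would be replaced by the parameters of an embedding network and the loss minimized with an automatic-differentiation framework; the exemplars would typically be precomputed once, for example by clustering, rather than sampled at random as in this toy setup.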

Cite

Text

Min et al. "Parametric T-Distributed Stochastic Exemplar-Centered Embedding." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2018. doi:10.1007/978-3-030-10925-7_29

Markdown

[Min et al. "Parametric T-Distributed Stochastic Exemplar-Centered Embedding." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2018.](https://mlanthology.org/ecmlpkdd/2018/min2018ecmlpkdd-parametric/) doi:10.1007/978-3-030-10925-7_29

BibTeX

@inproceedings{min2018ecmlpkdd-parametric,
  title     = {{Parametric T-Distributed Stochastic Exemplar-Centered Embedding}},
  author    = {Min, Martin Renqiang and Guo, Hongyu and Shen, Dinghan},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2018},
  pages     = {477--493},
  doi       = {10.1007/978-3-030-10925-7_29},
  url       = {https://mlanthology.org/ecmlpkdd/2018/min2018ecmlpkdd-parametric/}
}