Supervised Dimensionality Reduction and Visualization Using Centroid-Encoder
Abstract
We propose a new tool, called Centroid-Encoder (CE), for visualizing complex, potentially large and high-dimensional data sets. The architecture of the Centroid-Encoder is similar to that of an autoencoder neural network, but it has a modified target: the class centroid in the ambient space. As such, CE incorporates label information and performs a supervised data visualization. CE is trained in the usual way on a training set, with parameters tuned using a validation set. The evaluation of the resulting CE visualization is performed on a sequestered test set, where the generalization of the model is assessed both visually and quantitatively. We present a detailed comparative analysis of the method using a wide variety of data sets and techniques, both supervised and unsupervised, including NCA, non-linear NCA, t-distributed NCA, t-distributed MCML, supervised UMAP, supervised PCA, Colored Maximum Variance Unfolding, supervised Isomap, Parametric Embedding, supervised Neighbor Retrieval Visualizer, and Multiple Relational Embedding. An analysis of variance using PCA demonstrates that non-linear preprocessing of the data by the CE transformation captures more variance per dimension than PCA.
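To make the core idea concrete, below is a minimal sketch of a centroid-encoder, assuming PyTorch: an autoencoder-shaped network whose reconstruction target is each sample's class centroid in the ambient space rather than the sample itself, with the low-dimensional bottleneck serving as the supervised visualization. The layer sizes, toy data, and training settings are illustrative assumptions, not the configuration used in the paper.

```python
# Illustrative centroid-encoder sketch (assumed setup, not the authors' exact code).
import torch
import torch.nn as nn

class CentroidEncoder(nn.Module):
    """Autoencoder-shaped network with a 2-D bottleneck for visualization."""
    def __init__(self, in_dim, bottleneck_dim=2, hidden_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(in_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, bottleneck_dim),
        )
        self.decoder = nn.Sequential(
            nn.Linear(bottleneck_dim, hidden_dim), nn.Tanh(),
            nn.Linear(hidden_dim, in_dim),
        )

    def forward(self, x):
        z = self.encoder(x)          # low-dimensional embedding
        return self.decoder(z), z    # reconstruction and embedding

# Toy data: two Gaussian blobs in 10-D (hypothetical example data).
torch.manual_seed(0)
X = torch.cat([torch.randn(100, 10) + 2.0, torch.randn(100, 10) - 2.0])
y = torch.cat([torch.zeros(100, dtype=torch.long), torch.ones(100, dtype=torch.long)])

# Targets are the class centroids in the ambient (input) space.
centroids = torch.stack([X[y == c].mean(dim=0) for c in (0, 1)])
targets = centroids[y]

model = CentroidEncoder(in_dim=10)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()

for epoch in range(200):
    opt.zero_grad()
    recon, _ = model(X)
    loss = loss_fn(recon, targets)   # reconstruct the class centroid, not the input
    loss.backward()
    opt.step()

# The 2-D bottleneck activations give the supervised visualization coordinates.
with torch.no_grad():
    _, embedding = model(X)
print(embedding.shape)  # torch.Size([200, 2])
```

In practice the embedding would be computed for a sequestered test set to assess how well the learned map generalizes, as described in the abstract.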
Cite
Text
Ghosh and Kirby. "Supervised Dimensionality Reduction and Visualization Using Centroid-Encoder." Journal of Machine Learning Research, 2022.
Markdown
[Ghosh and Kirby. "Supervised Dimensionality Reduction and Visualization Using Centroid-Encoder." Journal of Machine Learning Research, 2022.](https://mlanthology.org/jmlr/2022/ghosh2022jmlr-supervised/)
BibTeX
@article{ghosh2022jmlr-supervised,
title = {{Supervised Dimensionality Reduction and Visualization Using Centroid-Encoder}},
author = {Ghosh, Tomojit and Kirby, Michael},
journal = {Journal of Machine Learning Research},
year = {2022},
pages = {1--34},
volume = {23},
url = {https://mlanthology.org/jmlr/2022/ghosh2022jmlr-supervised/}
}