A Least Squares Formulation for Canonical Correlation Analysis

Abstract

Canonical Correlation Analysis (CCA) is a well-known technique for finding the correlations between two sets of multi-dimensional variables. It projects both sets of variables into a lower-dimensional space in which they are maximally correlated. CCA is commonly applied for supervised dimensionality reduction, in which one of the multi-dimensional variables is derived from the class label. It has been shown that CCA can be formulated as a least squares problem in the binary-class case. However, their relationship in the more general setting remains unclear. In this paper, we show that, under a mild condition which tends to hold for high-dimensional data, CCA in multi-label classifications can be formulated as a least squares problem. Based on this equivalence relationship, we propose several CCA extensions including sparse CCA using 1-norm regularization. Experiments on multi-label data sets confirm the established equivalence relationship. Results also demonstrate the effectiveness of the proposed CCA extensions.
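As a rough illustration of the projection step the abstract describes, the following minimal sketch computes classical CCA with NumPy by whitening each view and taking an SVD of the whitened cross-covariance. It is not the paper's least squares formulation. The function name `cca`, the small ridge term `reg`, and the synthetic data are assumptions made only for this example.

```python
import numpy as np

def cca(X, Y, reg=1e-8):
    """Classical CCA (illustrative sketch, not the paper's method).

    X: (n, p) and Y: (n, q) arrays whose rows are paired samples.
    Returns projection matrices Wx, Wy and the canonical correlations.
    `reg` is a hypothetical small ridge term added for numerical stability.
    """
    n = X.shape[0]
    Xc = X - X.mean(axis=0)  # center each view
    Yc = Y - Y.mean(axis=0)
    Cxx = Xc.T @ Xc / (n - 1) + reg * np.eye(X.shape[1])
    Cyy = Yc.T @ Yc / (n - 1) + reg * np.eye(Y.shape[1])
    Cxy = Xc.T @ Yc / (n - 1)
    # Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)
    # Whitened cross-covariance M = Lx^{-1} Cxy Ly^{-T}
    A = np.linalg.solve(Lx, Cxy)
    M = np.linalg.solve(Ly, A.T).T
    # Singular values of M are the canonical correlations
    U, s, Vt = np.linalg.svd(M, full_matrices=False)
    Wx = np.linalg.solve(Lx.T, U)     # maps X to canonical variates
    Wy = np.linalg.solve(Ly.T, Vt.T)  # maps Y to canonical variates
    return Wx, Wy, s

# Synthetic two-view data sharing a 2-dimensional latent signal (assumed setup).
rng = np.random.default_rng(0)
Z = rng.normal(size=(500, 2))
X = Z @ rng.normal(size=(2, 5)) + 0.5 * rng.normal(size=(500, 5))
Y = Z @ rng.normal(size=(2, 2)) + 0.5 * rng.normal(size=(500, 2))

Wx, Wy, corrs = cca(X, Y)
print("canonical correlations:", corrs)
```

Projecting the centered data with `Wx` and `Wy` yields paired variates whose sample correlations match the returned singular values, which is the "maximally correlated" lower-dimensional space referred to above.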

Cite

Text

Sun et al. "A Least Squares Formulation for Canonical Correlation Analysis." International Conference on Machine Learning, 2008. doi:10.1145/1390156.1390285

Markdown

[Sun et al. "A Least Squares Formulation for Canonical Correlation Analysis." International Conference on Machine Learning, 2008.](https://mlanthology.org/icml/2008/sun2008icml-least/) doi:10.1145/1390156.1390285

BibTeX

@inproceedings{sun2008icml-least,
  title     = {{A Least Squares Formulation for Canonical Correlation Analysis}},
  author    = {Sun, Liang and Ji, Shuiwang and Ye, Jieping},
  booktitle = {International Conference on Machine Learning},
  year      = {2008},
  pages     = {1024--1031},
  doi       = {10.1145/1390156.1390285},
  url       = {https://mlanthology.org/icml/2008/sun2008icml-least/}
}