Embedding Heterogeneous Data Using Statistical Models

Globerson, Amir; Chechik, Gal; Pereira, Fernando; Tishby, Naftali

AAAI 2006 pp. 1605-1608

Abstract

Embedding algorithms are a method for revealing low-dimensional structure in complex data. Most embedding algorithms are designed to handle objects of a single type for which pairwise distances are specified. Here we describe a method for embedding objects of different types (such as authors and terms) into a single common Euclidean space based on their co-occurrence statistics. The joint distributions of the heterogeneous objects are modeled as exponentials of squared Euclidean distances in a low-dimensional embedding space. This construction links the problem to convex optimization over positive semidefinite matrices. We quantify the performance of our method on two text datasets, and show that it consistently and significantly outperforms standard methods of statistical correspondence modeling, such as multidimensional scaling and correspondence analysis.
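
As a rough illustration of the modeling idea in the abstract, the sketch below builds a joint distribution in which p(x, y) is proportional to exp(-||phi(x) - psi(y)||^2). It assumes the simplest form consistent with the abstract; the paper's actual model may include further terms (for example marginal factors), and the function name, variable names, and toy coordinates are hypothetical.

```python
import math

def joint_from_embeddings(phi, psi):
    """Return p[x][y] proportional to exp(-||phi[x] - psi[y]||^2).

    phi and psi are lists of coordinate lists for the two object types.
    This is an assumed, simplified form of the model, not the paper's code.
    """
    # Unnormalized weights: exponential of the negative squared distance.
    weights = [
        [math.exp(-sum((a - b) ** 2 for a, b in zip(px, py))) for py in psi]
        for px in phi
    ]
    # Normalize over all (x, y) pairs so the table sums to one.
    z = sum(sum(row) for row in weights)
    return [[w / z for w in row] for row in weights]

# Hypothetical 2-D embeddings: two objects of one type, three of another.
phi = [[0.0, 0.0], [2.0, 0.0]]
psi = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0]]
p = joint_from_embeddings(phi, psi)
```

In this toy setup, pairs embedded close together receive higher joint probability than pairs embedded far apart. Fitting the embeddings to observed co-occurrence counts is not shown here.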

Cite

Text

Globerson et al. "Embedding Heterogeneous Data Using Statistical Models." AAAI Conference on Artificial Intelligence, 2006.

Markdown

[Globerson et al. "Embedding Heterogeneous Data Using Statistical Models." AAAI Conference on Artificial Intelligence, 2006.](https://mlanthology.org/aaai/2006/globerson2006aaai-embedding/)

BibTeX

@inproceedings{globerson2006aaai-embedding,
  title     = {{Embedding Heterogeneous Data Using Statistical Models}},
  author    = {Globerson, Amir and Chechik, Gal and Pereira, Fernando and Tishby, Naftali},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2006},
  pages     = {1605-1608},
  url       = {https://mlanthology.org/aaai/2006/globerson2006aaai-embedding/}
}