Learning Distributed Representations with Efficient SoftMax Normalization

Abstract

Learning distributed representations, or embeddings, that encode the relational similarity patterns among objects is a relevant task in machine learning. A popular method to learn the embedding matrices $X, Y$ is to optimize a loss function involving the term ${\rm SoftMax}(XY^T)$. The cost of computing this term, however, scales quadratically with the problem size, making it a computationally heavy solution. In this article, we propose a linear-time heuristic approximation to compute the normalization constants of ${\rm SoftMax}(XY^T)$ for embedding vectors with bounded norms. We show on several pre-trained embedding datasets that the proposed estimation method matches or exceeds the accuracy of competing methods. Building on this result, we design an efficient and task-agnostic algorithm that learns the embeddings by optimizing the cross entropy between the softmax and a set of probability distributions given as inputs. The proposed algorithm is interpretable and easily adapted to arbitrary embedding problems. We consider a few use cases and observe similar or better performance and lower computation time than comparable "2Vec" algorithms.
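To make the complexity argument concrete, the sketch below computes the exact normalization constants $Z_i = \sum_j \exp(x_i^T y_j)$, which costs $O(n^2 d)$, alongside one possible linear-time estimate: a second-order Taylor expansion of the exponential, valid when the inner products are small (the bounded-norm regime the abstract refers to). The expansion and the norm scale 0.25 are illustrative assumptions for this sketch, not the paper's actual estimator.

import numpy as np

rng = np.random.default_rng(0)
n, d = 2000, 32

# Toy embedding matrices whose rows have small, bounded norms,
# matching the bounded-norm setting assumed above.
X = rng.normal(size=(n, d))
X *= 0.25 / np.linalg.norm(X, axis=1, keepdims=True)
Y = rng.normal(size=(n, d))
Y *= 0.25 / np.linalg.norm(Y, axis=1, keepdims=True)

# Exact normalization constants Z_i = sum_j exp(x_i . y_j): O(n^2 d).
Z_exact = np.exp(X @ Y.T).sum(axis=1)

# Hypothetical linear-time estimate via a second-order Taylor expansion
# of exp around 0, justified when |x_i . y_j| is small; a stand-in for
# the paper's heuristic, not its estimator.
mu = Y.mean(axis=0)        # mean of the y_j: O(n d)
S = (Y.T @ Y) / n          # second-moment matrix: O(n d^2), computed once
Z_approx = n * (1.0 + X @ mu + 0.5 * np.einsum("id,de,ie->i", X, S, X))

rel_err = np.abs(Z_approx - Z_exact) / Z_exact
print(f"median relative error: {np.median(rel_err):.2e}")

After the one-time $O(nd^2)$ moment computation, each $Z_i$ costs $O(d^2)$, so the total is linear in $n$; this is the kind of trade the abstract's linear-time claim describes.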

Cite

Text

Dall'Amico and Belliardo. "Learning Distributed Representations with Efficient SoftMax Normalization." Transactions on Machine Learning Research, 2025.

Markdown

[Dall'Amico and Belliardo. "Learning Distributed Representations with Efficient SoftMax Normalization." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/dallamico2025tmlr-learning/)

BibTeX

@article{dallamico2025tmlr-learning,
  title     = {{Learning Distributed Representations with Efficient SoftMax Normalization}},
  author    = {Dall'Amico, Lorenzo and Belliardo, Enrico Maria},
  journal   = {Transactions on Machine Learning Research},
  year      = {2025},
  url       = {https://mlanthology.org/tmlr/2025/dallamico2025tmlr-learning/}
}