Lambert Matrix Factorization
Abstract
Many data-generating processes result in skewed data, which should be modeled by distributions that can capture the skewness. In this work we adopt the flexible family of Lambert W distributions, which combine an arbitrary standard distribution with a specific nonlinear transformation to incorporate skewness. We describe how Lambert W distributions can be used in probabilistic programs by providing stable gradient-based inference, and demonstrate their use in matrix factorization. In particular, we focus on modeling logarithmically transformed count data. We analyze the weighted squared loss used by state-of-the-art word embedding models to learn interpretable representations from word co-occurrences and show that a generative model capturing the essential properties of those models can be built using Lambert W distributions.
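As an illustration of the kind of nonlinear transformation the abstract refers to, the following is a minimal sketch of the standard Lambert W x Gaussian construction, in which a latent standard normal variable u is mapped to a skewed variable via z = u exp(γu) and recovered through the Lambert W function. The function names, the skewness value `gamma = 0.1`, and the location and scale values are illustrative assumptions, not taken from the paper, and the paper's own parameterization and inference scheme may differ.

```python
import numpy as np
from scipy.special import lambertw


def lambert_forward(u, gamma):
    """Skewing transformation z = u * exp(gamma * u)."""
    return u * np.exp(gamma * u)


def lambert_inverse(z, gamma):
    """Invert the transformation using the principal branch of Lambert W.

    Valid when gamma * z >= -1/e, i.e. for latent values u >= -1/gamma
    (assuming gamma > 0).
    """
    if gamma == 0.0:
        return z  # no skewness: the transformation is the identity
    return np.real(lambertw(gamma * z, k=0)) / gamma


rng = np.random.default_rng(0)
gamma = 0.1            # illustrative skewness parameter (assumption)
mu, sigma = 0.0, 1.0   # illustrative location and scale (assumption)

u = rng.standard_normal(10_000)                # symmetric latent samples
y = mu + sigma * lambert_forward(u, gamma)     # skewed observations
u_rec = lambert_inverse((y - mu) / sigma, gamma)

# Sample skewness of the transformed data (positive for gamma > 0)
skewness = np.mean(((y - y.mean()) / y.std()) ** 3)
```

Setting `gamma` to zero recovers the underlying symmetric distribution, while positive values stretch the right tail, so a single parameter controls the degree of skewness.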
Cite
Text
Klami et al. "Lambert Matrix Factorization." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2018. doi:10.1007/978-3-030-10928-8_19
Markdown
[Klami et al. "Lambert Matrix Factorization." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2018.](https://mlanthology.org/ecmlpkdd/2018/klami2018ecmlpkdd-lambert/) doi:10.1007/978-3-030-10928-8_19
BibTeX
@inproceedings{klami2018ecmlpkdd-lambert,
title = {{Lambert Matrix Factorization}},
author = {Klami, Arto and Lagus, Jarkko and Sakaya, Joseph},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2018},
pages = {311--326},
doi = {10.1007/978-3-030-10928-8_19},
url = {https://mlanthology.org/ecmlpkdd/2018/klami2018ecmlpkdd-lambert/}
}