Collaborative Filtering with Information-Rich and Information-Sparse Entities
Abstract
In this paper, we consider a popular model for collaborative filtering in recommender systems. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered, and further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with $\omega(MK \log M)$ noisy entries, while $MK$ entries are necessary, where $K$ is the number of clusters and $M$ is the number of items.
In the case of co-clustering, we prove that $K^2$ entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when $K$ is sufficiently large. Extensive simulations on Netflix and MovieLens data show that our algorithm outperforms the alternating-minimization and popularity-among-friends algorithms. The performance gap widens further when noise is added to the datasets.
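The $K^2$ lower bound can be made intuitive with a minimal sketch (this is an illustration of the co-clustering model described above, not the paper's recovery algorithm; all variable names and the $M \times M$ sizing are assumptions for the example): when each user and each item belongs to one of $K$ clusters and a rating depends only on the cluster pair, the entire rating matrix is determined by a $K \times K$ block-rating table, i.e. only $K^2$ free values.

```python
import numpy as np

# Illustrative sketch of the co-clustering model (not the paper's algorithm):
# ratings depend only on the (user cluster, item cluster) pair, so the full
# M x M matrix is generated from a K x K table -- only K^2 free parameters,
# matching the K^2-entries lower bound for recovery.
rng = np.random.default_rng(0)
K, M = 4, 12                                       # assumed toy sizes
block_ratings = rng.integers(1, 6, size=(K, K))    # the K^2 free parameters
user_cluster = rng.integers(0, K, size=M)          # cluster label per user
item_cluster = rng.integers(0, K, size=M)          # cluster label per item

# Expand the K x K table into the full M x M rating matrix.
R = block_ratings[np.ix_(user_cluster, item_cluster)]

# R has at most K distinct rows, so its rank is at most K.
assert R.shape == (M, M)
assert np.linalg.matrix_rank(R) <= K
```

Because every row of `R` is a copy of one of at most $K$ patterns, the matrix is low-rank and highly redundant, which is what makes recovery from far fewer than $M^2$ observed entries possible.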
Cite

Zhu et al. "Collaborative Filtering with Information-Rich and Information-Sparse Entities." Machine Learning, 2014. doi:10.1007/s10994-014-5454-z
BibTeX
@article{zhu2014mlj-collaborative,
title = {{Collaborative Filtering with Information-Rich and Information-Sparse Entities}},
author = {Zhu, Kai and Wu, Rui and Ying, Lei and Srikant, R.},
journal = {Machine Learning},
year = {2014},
pages = {177-203},
doi = {10.1007/s10994-014-5454-z},
volume = {97},
url = {https://mlanthology.org/mlj/2014/zhu2014mlj-collaborative/}
}