Collaborative Filtering with Information-Rich and Information-Sparse Entities

Abstract

In this paper, we consider a popular model for collaborative filtering in recommender systems. In particular, we consider both the clustering model, where only users (or items) are clustered, and the co-clustering model, where both users and items are clustered; further, we assume that some users rate many items (information-rich users) and some users rate only a few items (information-sparse users). When users (or items) are clustered, our algorithm can recover the rating matrix with ω(MK log M) noisy entries, while MK entries are necessary, where K is the number of clusters and M is the number of items.
In the case of co-clustering, we prove that K² entries are necessary for recovering the rating matrix, and our algorithm achieves this lower bound within a logarithmic factor when K is sufficiently large. Extensive simulations on Netflix and MovieLens data show that our algorithm outperforms the alternating minimization and popularity-among-friends algorithms. The performance gap widens further when noise is added to the datasets.
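The setting described above can be illustrated with a small simulation. The sketch below is not the authors' algorithm; it is a minimal, hypothetical generator for the co-clustering model the abstract describes: ratings depend only on the (user-cluster, item-cluster) pair, and entries are observed with a high sampling probability for information-rich users and a low one for information-sparse users. All function names and parameter values are illustrative assumptions.

```python
import random


def make_coclustered_ratings(n_users=200, n_items=200, k=4, seed=0):
    """Simulate the co-clustering model: each user belongs to one of k user
    clusters, each item to one of k item clusters, and the true rating
    depends only on the cluster pair (a block-constant rating matrix)."""
    rng = random.Random(seed)
    user_cluster = [rng.randrange(k) for _ in range(n_users)]
    item_cluster = [rng.randrange(k) for _ in range(n_items)]
    # One rating value per (user-cluster, item-cluster) block.
    block = [[rng.choice([1, 2, 3, 4, 5]) for _ in range(k)] for _ in range(k)]
    ratings = [[block[user_cluster[u]][item_cluster[i]] for i in range(n_items)]
               for u in range(n_users)]
    return ratings, user_cluster, item_cluster


def observe(ratings, rich_frac=0.1, p_rich=0.5, p_sparse=0.02, seed=1):
    """Reveal each entry independently: with probability p_rich for the
    information-rich users (here, the first rich_frac of rows) and p_sparse
    for the information-sparse users; unobserved entries are None."""
    rng = random.Random(seed)
    n_rich = int(rich_frac * len(ratings))
    observed = []
    for u, row in enumerate(ratings):
        p = p_rich if u < n_rich else p_sparse
        observed.append([r if rng.random() < p else None for r in row])
    return observed
```

Under this model, two users in the same cluster have identical true rating rows, which is what makes recovery from few observed entries possible in principle: sparse users can borrow information from rich users in the same cluster.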

Cite

Text

Zhu et al. "Collaborative Filtering with Information-Rich and Information-Sparse Entities." Machine Learning, 2014. doi:10.1007/S10994-014-5454-Z

Markdown

[Zhu et al. "Collaborative Filtering with Information-Rich and Information-Sparse Entities." Machine Learning, 2014.](https://mlanthology.org/mlj/2014/zhu2014mlj-collaborative/) doi:10.1007/S10994-014-5454-Z

BibTeX

@article{zhu2014mlj-collaborative,
  title     = {{Collaborative Filtering with Information-Rich and Information-Sparse Entities}},
  author    = {Zhu, Kai and Wu, Rui and Ying, Lei and Srikant, R.},
  journal   = {Machine Learning},
  year      = {2014},
  pages     = {177--203},
  doi       = {10.1007/S10994-014-5454-Z},
  volume    = {97},
  url       = {https://mlanthology.org/mlj/2014/zhu2014mlj-collaborative/}
}