RaCT: Toward Amortized Ranking-Critical Training for Collaborative Filtering

Abstract

We investigate new methods for training collaborative filtering models based on actor-critic reinforcement learning, to more directly maximize ranking-based objective functions. Specifically, we train a critic network to approximate ranking-based metrics, and then update the actor network to directly optimize against the learned metrics. In contrast to traditional learning-to-rank methods that require re-running the optimization procedure for new lists, our critic-based method amortizes the scoring process with a neural network, and can directly provide the (approximate) ranking scores for new lists. We demonstrate that this actor-critic framework significantly improves the performance of a variety of prediction models, achieving performance better than or comparable to strong baselines on three large-scale datasets.
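The motivation for a learned critic is that ranking metrics are non-differentiable in the model's scores, because they depend on a sort. As a minimal NumPy sketch (assuming truncated NDCG@k as a representative ranking-based metric; the abstract does not name a specific one), the hard `argsort` below is the step that blocks gradient-based training and that the critic network learns to approximate smoothly:

```python
import numpy as np

def ndcg_at_k(scores, relevance, k=100):
    """Truncated NDCG. Non-differentiable in `scores` due to the argsort,
    which is why RaCT-style training approximates it with a critic network."""
    order = np.argsort(-scores)                      # rank items by predicted score
    gains = relevance[order][:k]                     # relevance in predicted order
    discounts = 1.0 / np.log2(np.arange(2, k + 2))   # positional discount 1/log2(rank+1)
    dcg = np.sum(gains * discounts[:len(gains)])
    ideal = np.sort(relevance)[::-1][:k]             # best possible ordering
    idcg = np.sum(ideal * discounts[:len(ideal)])
    return dcg / idcg if idcg > 0 else 0.0
```

In the amortized setup, this exact (but non-differentiable) value serves as the regression target for the critic, and the actor is then updated through the differentiable critic instead of through the sort.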

Cite

Text

Lobel et al. "RaCT: Toward Amortized Ranking-Critical Training for Collaborative Filtering." International Conference on Learning Representations, 2020.

Markdown

[Lobel et al. "RaCT: Toward Amortized Ranking-Critical Training for Collaborative Filtering." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/lobel2020iclr-ract-a/)

BibTeX

@inproceedings{lobel2020iclr-ract-a,
  title     = {{RaCT: Toward Amortized Ranking-Critical Training for Collaborative Filtering}},
  author    = {Lobel, Sam and Li, Chunyuan and Gao, Jianfeng and Carin, Lawrence},
  booktitle = {International Conference on Learning Representations},
  year      = {2020},
  url       = {https://mlanthology.org/iclr/2020/lobel2020iclr-ract-a/}
}