Calibrating Video Watch-Time Predictions with Credible Prototype Alignment
Abstract
Accurately predicting user watch-time is crucial for improving user stickiness and retention in video recommendation systems. Existing watch-time prediction approaches typically transform the watch-time labels for prediction and then reverse the transformation, ignoring both the natural distributional properties of the labels and the confusion among instance representations, which leads to inaccurate predictions. In this paper, we propose ProWTP, a two-stage method that combines prototype learning and optimal transport for watch-time regression and is suitable for any deep recommendation model. Specifically, we observe that the watch-ratio (the ratio of watch-time to video duration) within the same duration bucket exhibits a multimodal distribution. To facilitate incorporation into models, we use a hierarchical vector quantised variational autoencoder (HVQ-VAE) to convert the continuous label distribution into a high-dimensional discrete distribution, which serves as a set of credible prototypes for calibration. Based on this, ProWTP views the alignment between prototypes and instance representations as a Semi-relaxed Unbalanced Optimal Transport (SUOT) problem, in which the marginal constraints of the prototypes are relaxed. The corresponding optimization problem is then reformulated as a weighted Lasso problem and solved. Moreover, ProWTP introduces assignment and compactness losses to encourage instances to cluster closely around their respective prototypes, thereby enhancing prototype-level distinguishability. Finally, extensive offline experiments on two industrial datasets demonstrate the consistent superiority of ProWTP in real-world applications.
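To make two of the quantities in the abstract concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It computes the watch-ratio per duration bucket and then evaluates a simple compactness measure around prototypes. Everything beyond the watch-ratio definition is an assumption made for illustration: the function names, the bucket edges, and the prototype vectors are hypothetical, and hard nearest-prototype assignment is swapped in as a stand-in for the SUOT alignment, which the abstract does not specify in enough detail to reproduce.

```python
from collections import defaultdict


def watch_ratio(watch_time, duration):
    """Watch-ratio as defined in the abstract: watch-time / video duration."""
    if duration <= 0:
        raise ValueError("duration must be positive")
    return watch_time / duration


def bucket_by_duration(records, edges):
    """Group watch-ratios by duration bucket.

    records: iterable of (watch_time, duration) pairs, e.g. in seconds.
    edges:   ascending bucket upper bounds (hypothetical values); a duration
             larger than every edge lands in the final overflow bucket.
    """
    buckets = defaultdict(list)
    for watch_time, duration in records:
        # Bucket index = number of edges strictly below the duration.
        idx = sum(duration > edge for edge in edges)
        buckets[idx].append(watch_ratio(watch_time, duration))
    return dict(buckets)


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def assign_to_prototypes(instances, prototypes):
    """Hard nearest-prototype assignment (a simplification, NOT SUOT)."""
    return [
        min(range(len(prototypes)),
            key=lambda k: squared_distance(z, prototypes[k]))
        for z in instances
    ]


def compactness(instances, prototypes, assignment):
    """Mean squared distance of each instance to its assigned prototype."""
    total = sum(
        squared_distance(z, prototypes[k])
        for z, k in zip(instances, assignment)
    )
    return total / len(instances)


# Toy data (made up for illustration only).
records = [(10, 20), (30, 30), (50, 100), (90, 100)]
ratios_by_bucket = bucket_by_duration(records, edges=[30, 60])

instances = [[0.0, 0.0], [1.0, 1.0], [0.9, 1.2]]
prototypes = [[0.0, 0.0], [1.0, 1.0]]
assignment = assign_to_prototypes(instances, prototypes)
loss = compactness(instances, prototypes, assignment)
```

In this sketch a lower `loss` means instances sit closer to their prototypes, which is the intuition behind the compactness term. The paper's method instead obtains a soft transport plan with relaxed prototype marginals, so the hard `min` above should be read only as a placeholder for that step.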
Cite
Text
Cui et al. "Calibrating Video Watch-Time Predictions with Credible Prototype Alignment." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Cui et al. "Calibrating Video Watch-Time Predictions with Credible Prototype Alignment." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/cui2025icml-calibrating/)
BibTeX
@inproceedings{cui2025icml-calibrating,
title = {{Calibrating Video Watch-Time Predictions with Credible Prototype Alignment}},
author = {Cui, Chao and Tang, Shisong and Li, Fan and Gao, Jiechao and Chen, Hechang},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {11563-11584},
volume = {267},
url = {https://mlanthology.org/icml/2025/cui2025icml-calibrating/}
}