GoalRank: Group-Relative Optimization for a Large Ranking Model
Abstract
Mainstream ranking approaches typically follow a two-stage Generator–Evaluator paradigm, in which a generator produces candidate lists and an evaluator selects the best one. Recent work has attempted to improve performance by expanding the number of candidate lists, for example through multi-generator settings. However, because ranking involves selecting a recommendation list from a combinatorially large space, simply enlarging the candidate set remains ineffective, and performance gains quickly saturate. At the same time, recent advances in large recommendation models have shown that end-to-end one-stage models can achieve promising performance and are expected to follow scaling laws. Motivated by this, we revisit ranking from a generator-only one-stage perspective. We theoretically prove that, for any (finite Multi-)Generator–Evaluator model, there always exists a generator-only model that achieves strictly smaller approximation error to the optimal ranking policy, while also enjoying a scaling law as its size increases. Building on this result, we derive an evidence upper bound of the one-stage optimization objective, from which we find that a reward model trained on real user feedback can be used to construct a reference policy in a group-relative manner. This reference policy serves as a practical surrogate for the optimal policy, enabling effective training of a large generator-only ranker. Based on these insights, we propose GoalRank, a generator-only ranking framework. Extensive offline experiments on public benchmarks and large-scale online A/B tests demonstrate that GoalRank consistently outperforms state-of-the-art methods.
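The abstract does not spell out how the group-relative reference policy is built, so the following is only an illustrative sketch of one plausible reading: score a group of candidate lists with a reward model, standardize the scores within the group, and apply a softmax to obtain a distribution over the group. The function name `group_relative_reference`, the `temperature` parameter, and the example reward values are assumptions made for illustration and are not taken from the paper.

```python
import math


def group_relative_reference(rewards, temperature=1.0):
    """Map reward-model scores for one group of candidate lists to a
    probability distribution over that group.

    Hypothetical helper for illustration; not the paper's exact formulation.
    """
    n = len(rewards)
    mean = sum(rewards) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / n)
    # Standardize within the group; use zeros if all rewards are identical.
    z = [(r - mean) / std if std > 0 else 0.0 for r in rewards]
    # Softmax over standardized scores, shifted by the max for stability.
    m = max(z)
    exps = [math.exp((v - m) / temperature) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


# Assumed reward-model scores for four candidate lists of the same request.
rewards = [0.2, 0.5, 0.9, 0.4]
reference = group_relative_reference(rewards)
```

Under this reading, a generator-only ranker could be trained to match `reference` on the sampled group, so that lists scoring above the group average receive more probability mass than those below it.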
Cite
Text
Zhang et al. "GoalRank: Group-Relative Optimization for a Large Ranking Model." International Conference on Learning Representations, 2026.
Markdown
[Zhang et al. "GoalRank: Group-Relative Optimization for a Large Ranking Model." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zhang2026iclr-goalrank/)
BibTeX
@inproceedings{zhang2026iclr-goalrank,
title = {{GoalRank: Group-Relative Optimization for a Large Ranking Model}},
author = {Zhang, Kaike and Wang, Xiaobei and Liu, Shuchang and Yang, Hailan and Li, Xiang and Hu, Lantao and Li, Han and Cao, Qi and Sun, Fei and Gai, Kun},
booktitle = {International Conference on Learning Representations},
year = {2026},
url = {https://mlanthology.org/iclr/2026/zhang2026iclr-goalrank/}
}