GoalRank: Group-Relative Optimization for a Large Ranking Model

Kaike Zhang, Xiaobei Wang, Shuchang Liu, Hailan Yang, Xiang Li, Lantao Hu, Han Li, Qi Cao, Fei Sun, Kun Gai

ICLR 2026

Abstract

Mainstream ranking approaches typically follow a Generator–Evaluator two-stage paradigm, where a generator produces candidate lists and an evaluator selects the best one. Recent work has attempted to enhance performance by expanding the number of candidate lists, for example, through multi-generator settings. However, because ranking involves selecting a recommendation list from a combinatorially large space, simply enlarging the candidate set remains ineffective, and performance gains quickly saturate. At the same time, recent advances in large recommendation models have shown that end-to-end one-stage models can achieve promising performance and are expected to follow scaling laws. Motivated by this, we revisit ranking from a generator-only one-stage perspective. We theoretically prove that, for any (finite Multi-)Generator–Evaluator model, there always exists a generator-only model that achieves strictly smaller approximation error to the optimal ranking policy, while also enjoying a scaling law as its size increases. Building on this result, we derive an evidence upper bound of the one-stage optimization objective, from which we find that one can leverage a reward model trained on real user feedback to construct a reference policy in a group-relative manner. This reference policy serves as a practical surrogate for the optimal policy, enabling effective training of a large generator-only ranker. Based on these insights, we propose GoalRank, a generator-only ranking framework. Extensive offline experiments on public benchmarks and large-scale online A/B tests demonstrate that GoalRank consistently outperforms state-of-the-art methods.
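
The abstract does not spell out how the group-relative reference policy is built, so the following is only an illustrative sketch under stated assumptions, not the authors' construction. It assumes that a reward model has already scored a group of candidate lists for one request, that rewards are standardized within the group, and that a softmax over the standardized rewards gives the reference distribution the generator is trained toward. The function names `group_relative_reference` and `cross_entropy`, the `temperature` parameter, and the standardize-then-softmax choice are all hypothetical.

```python
import math


def group_relative_reference(rewards, temperature=1.0, eps=1e-8):
    """Map reward-model scores for one group of candidate lists to a
    reference distribution over that group (assumed construction)."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / n)
    # Group-relative score: standardize each reward within its own group.
    adv = [(r - mean) / (std + eps) for r in rewards]
    # Softmax over the standardized scores, shifted by the max for stability.
    top = max(adv)
    exps = [math.exp((a - top) / temperature) for a in adv]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(reference, policy_probs, eps=1e-12):
    """Cross-entropy of the generator's list probabilities against the
    reference distribution; lower means closer to the reference."""
    return -sum(p * math.log(q + eps) for p, q in zip(reference, policy_probs))


# Hypothetical reward-model scores for three candidate lists of one request.
reference = group_relative_reference([0.2, 0.9, 0.5])
uniform = [1.0 / 3.0] * 3
loss = cross_entropy(reference, uniform)
```

In this sketch the candidate list with the highest reward receives the largest reference probability, and a group whose rewards are all equal yields a uniform reference, so such a group carries no preference signal.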

Cite

Text

Zhang et al. "GoalRank: Group-Relative Optimization for a Large Ranking Model." International Conference on Learning Representations, 2026.

Markdown

[Zhang et al. "GoalRank: Group-Relative Optimization for a Large Ranking Model." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/zhang2026iclr-goalrank/)

BibTeX

@inproceedings{zhang2026iclr-goalrank,
  title     = {{GoalRank: Group-Relative Optimization for a Large Ranking Model}},
  author    = {Zhang, Kaike and Wang, Xiaobei and Liu, Shuchang and Yang, Hailan and Li, Xiang and Hu, Lantao and Li, Han and Cao, Qi and Sun, Fei and Gai, Kun},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/zhang2026iclr-goalrank/}
}