Let the LLM Stick to Its Strengths: Learning to Route Economical LLM
Abstract
Recently, test-time scaling of Large Language Models (LLMs) has emerged as a practical alternative to parameter and data scaling. Reasoning tasks often require large-scale, RLVR-based LLMs, while more economical LLMs can handle simpler tasks. Routing each query to an LLM based on its *suitability* (*i.e.*, capability and cost) preserves usability while improving efficiency. We introduce LLMRec, which routes the most suitable LLM to each user query without pre-inference over the candidate LLM zoo. It is the first to reframe the LLM routing problem as a comprehensive recommendation system (RecSys) task. Our core insight is that an LLM's suitability for a query is a complex, latent signal analogous to user-item preference. LLMRec systematically engineers features for candidate LLMs (intrinsic attributes and capability distributions), queries (general semantics and meta-dimensional information), and context (inference type, cost budgets), and incorporates behavioral features to learn high-order interactions. LLMRec is designed to generalize to out-of-domain datasets and to adapt to new LLMs as the model zoo evolves. We define the evaluation metric via the Pareto frontier under user-specified cost budgets. Across six datasets, LLMRec reduces cost by over 38% on average while maintaining accuracy and consistently outperforming baselines in converging toward the Pareto frontier.
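To make the recommendation framing concrete, here is a minimal sketch of routing as budget-constrained suitability scoring. All names, features, and numbers below are illustrative assumptions, not the paper's actual model: the learned high-order interaction model is stood in for by a simple dot product between query task weights and an LLM's capability distribution.

```python
# Hypothetical sketch of suitability-based LLM routing under a cost budget.
# Features and scores are illustrative; the paper learns these interactions.
from dataclasses import dataclass, field

@dataclass
class CandidateLLM:
    name: str
    cost_per_query: float            # intrinsic attribute (e.g. USD per query)
    capability: dict = field(default_factory=dict)  # capability distribution

def suitability(llm, query_feats, budget):
    """Score one LLM for a query; disqualify it if it exceeds the budget."""
    if llm.cost_per_query > budget:
        return float("-inf")
    # Stand-in for the learned interaction model: dot product of the
    # query's task-type weights with the LLM's capability distribution.
    return sum(query_feats.get(k, 0.0) * v for k, v in llm.capability.items())

def route(zoo, query_feats, budget):
    """Pick the most suitable LLM without running inference on any candidate."""
    return max(zoo, key=lambda m: suitability(m, query_feats, budget))

zoo = [
    CandidateLLM("large-reasoner", cost_per_query=2.0,
                 capability={"math": 0.9, "chat": 0.7}),
    CandidateLLM("small-chat", cost_per_query=0.1,
                 capability={"math": 0.3, "chat": 0.8}),
]
# A simple chat query under a tight budget routes to the cheaper model.
print(route(zoo, {"chat": 1.0}, budget=0.5).name)  # -> small-chat
```

The budget acts as a hard filter here; the actual system evaluates routers by how closely their accuracy–cost trade-off converges toward the Pareto frontier.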
Cite
Text

Zhang et al. "Let the LLM Stick to Its Strengths: Learning to Route Economical LLM." Advances in Neural Information Processing Systems, 2025.

Markdown

[Zhang et al. "Let the LLM Stick to Its Strengths: Learning to Route Economical LLM." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/zhang2025neurips-let/)

BibTeX
@inproceedings{zhang2025neurips-let,
title = {{Let the LLM Stick to Its Strengths: Learning to Route Economical LLM}},
author = {Zhang, Yi-Kai and Lu, Shiyin and Chen, Qing-Guo and Luo, Weihua and Zhan, De-Chuan and Ye, Han-Jia},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/zhang2025neurips-let/}
}