Universal LLM Routing with Correctness-Based Representation

Abstract

Large language models’ rapid advances in capability have been accompanied by substantial increases in inference cost. Model routing is a simple technique for reducing inference cost: one maintains a pool of candidate LLMs and learns to route each prompt to the smallest feasible LLM. Existing works focus on learning a router for a fixed pool of LLMs. In this paper, we consider the problem of dynamic routing, where new, previously unobserved LLMs are available at test time. We propose a new approach to this problem that represents each LLM as a feature vector derived from its predictions on a set of representative prompts. Building on this representation, we detail an effective cluster-based routing strategy, and prove that it estimates a theoretically optimal routing rule. Experiments on a range of public benchmarks show the effectiveness of the proposal in routing amongst more than 30 unseen LLMs.
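To make the abstract's idea concrete, the following is a minimal sketch (not the paper's actual method or code) of how correctness-based representation and cluster-based routing could fit together: each LLM is summarized by a 0/1 correctness vector over representative prompts, prompts are grouped by a simple k-means over their embeddings, and a new prompt is routed to the cheapest LLM whose estimated per-cluster accuracy clears a threshold. All function names, the clustering choice, and the threshold rule here are illustrative assumptions.

```python
import numpy as np

def correctness_vectors(eval_results):
    # eval_results: {llm_name: list of 0/1 correctness on representative prompts}
    return {name: np.asarray(v, dtype=float) for name, v in eval_results.items()}

def cluster_prompts(prompt_embeddings, k, iters=50, seed=0):
    # Plain k-means over prompt embeddings (an illustrative clustering choice).
    rng = np.random.default_rng(seed)
    centers = prompt_embeddings[rng.choice(len(prompt_embeddings), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(prompt_embeddings[:, None] - centers[None], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = prompt_embeddings[labels == j]
            if len(members):
                centers[j] = members.mean(axis=0)
    return centers, labels

def cluster_accuracy(corr_vec, labels, k):
    # Mean correctness of one LLM within each prompt cluster.
    return np.array([corr_vec[labels == j].mean() if (labels == j).any() else 0.0
                     for j in range(k)])

def route(prompt_emb, centers, per_llm_cluster_acc, costs, threshold=0.7):
    # Assign the new prompt to its nearest cluster, then pick the cheapest LLM
    # whose estimated accuracy on that cluster passes the threshold.
    j = np.linalg.norm(centers - prompt_emb, axis=1).argmin()
    feasible = [(costs[name], name)
                for name, acc in per_llm_cluster_acc.items() if acc[j] >= threshold]
    if feasible:
        return min(feasible)[1]
    # Fallback: the most accurate LLM on this cluster.
    return max(per_llm_cluster_acc, key=lambda n: per_llm_cluster_acc[n][j])
```

Because a new LLM only needs its correctness vector on the representative prompts, it can be added to the pool at test time without retraining the router, which is the property the dynamic-routing setting requires.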

Cite

Text

Jitkrittum et al. "Universal LLM Routing with Correctness-Based Representation." ICLR 2025 Workshops: SCOPE, 2025.

Markdown

[Jitkrittum et al. "Universal LLM Routing with Correctness-Based Representation." ICLR 2025 Workshops: SCOPE, 2025.](https://mlanthology.org/iclrw/2025/jitkrittum2025iclrw-universal/)

BibTeX

@inproceedings{jitkrittum2025iclrw-universal,
  title     = {{Universal LLM Routing with Correctness-Based Representation}},
  author    = {Jitkrittum, Wittawat and Narasimhan, Harikrishna and Rawat, Ankit Singh and Juneja, Jeevesh and Wang, Zifeng and Lee, Chen-Yu and Shenoy, Pradeep and Panigrahy, Rina and Menon, Aditya Krishna and Kumar, Sanjiv},
  booktitle = {ICLR 2025 Workshops: SCOPE},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/jitkrittum2025iclrw-universal/}
}