RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models
Abstract
Recent works show that assembling multiple off-the-shelf large language models (LLMs) can harness their complementary abilities. To achieve this, routing is a promising method, which learns a router to select the most suitable LLM for each query. However, existing routing models are ineffective when multiple LLMs perform well for a query. To address this problem, in this paper, we propose a method called query-based Router by Dual Contrastive learning (RouterDC). The RouterDC model, which consists of an encoder and LLM embeddings, is trained by two proposed contrastive losses (sample-LLM and sample-sample losses). Experimental results show that RouterDC is effective in assembling LLMs and largely outperforms individual top-performing LLMs as well as existing routing methods on both in-distribution (+2.76%) and out-of-distribution (+1.90%) tasks. The source code is available at https://github.com/shuhao02/RouterDC.
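To make the described architecture concrete, below is a minimal PyTorch sketch of a query-based router with learnable LLM embeddings and an InfoNCE-style sample-LLM contrastive loss. The class and function names (`QueryRouter`, `sample_llm_loss`), the cosine-similarity scoring, and the temperature value are illustrative assumptions, not the authors' exact implementation; see the linked repository for the official code. The sample-sample loss, which additionally contrasts similar against dissimilar queries, is omitted for brevity.

```python
import torch
import torch.nn.functional as F

class QueryRouter(torch.nn.Module):
    """Hypothetical router: a text encoder plus one learnable embedding per LLM."""

    def __init__(self, encoder: torch.nn.Module, num_llms: int, dim: int):
        super().__init__()
        self.encoder = encoder  # maps a batch of queries to (batch, dim) vectors
        self.llm_emb = torch.nn.Parameter(torch.randn(num_llms, dim))

    def forward(self, query_vecs: torch.Tensor) -> torch.Tensor:
        # Cosine similarity between each encoded query and every LLM embedding.
        q = F.normalize(query_vecs, dim=-1)
        k = F.normalize(self.llm_emb, dim=-1)
        return q @ k.t()  # (batch, num_llms) routing scores

def sample_llm_loss(scores: torch.Tensor, pos_idx: torch.Tensor,
                    temperature: float = 0.07) -> torch.Tensor:
    # InfoNCE-style objective: pull each query toward the embedding of an LLM
    # that answers it well (pos_idx) and push it away from the other LLMs.
    return F.cross_entropy(scores / temperature, pos_idx)
```

At inference time, routing would simply pick `scores.argmax(dim=-1)` and dispatch the query to the corresponding LLM.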
Cite
Text
Chen et al. "RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models." Neural Information Processing Systems, 2024. doi:10.52202/079017-2120
Markdown
[Chen et al. "RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/chen2024neurips-routerdc/) doi:10.52202/079017-2120
BibTeX
@inproceedings{chen2024neurips-routerdc,
title = {{RouterDC: Query-Based Router by Dual Contrastive Learning for Assembling Large Language Models}},
author = {Chen, Shuhao and Jiang, Weisen and Lin, Baijiong and Kwok, James T. and Zhang, Yu},
booktitle = {Neural Information Processing Systems},
year = {2024},
doi = {10.52202/079017-2120},
url = {https://mlanthology.org/neurips/2024/chen2024neurips-routerdc/}
}