NexusRaven: A Commercially-Permissive Language Model for Function Calling

Abstract

The rise of open-source, commercially permissive large language models (LLMs) is revolutionizing generative AI, offering organizations greater control, reduced data risks, and cost benefits compared to proprietary models. However, in the field of tool use and function-calling LLMs, many open-source models, such as Gorilla and ToolLLAMA, depend on proprietary LLMs like GPT-4 for high-quality training data, whose terms of use often restrict competitive commercial applications. In this paper, we introduce NexusRaven-13B, an open-source LLM designed for function calls. Originating from the CodeLLAMA-13B lineage, NexusRaven-13B employs a unique data curation pipeline with multi-step refinement, ensuring high-quality training data without relying on GPT-4 distillation. NexusRaven-13B matches GPT-3.5 in zero-shot function-calling accuracy. When combined with our second core technique, demonstration retrieval augmentation, its performance significantly surpasses GPT-4. The code, model, and demo will be available after the review process.
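The abstract's second core technique, demonstration retrieval augmentation, can be pictured as retrieving the stored (query, call) examples most similar to the user's query and prepending them to the function-calling prompt. The sketch below is a minimal illustration of that idea; the lexical-overlap retriever, demonstration store, and prompt format are assumptions for exposition, not NexusRaven's actual implementation.

```python
# Hypothetical sketch of demonstration retrieval augmentation for a
# function-calling LLM. All names and formats here are illustrative.

def score(query: str, text: str) -> int:
    """Crude lexical overlap between the user query and a stored query."""
    return len(set(query.lower().split()) & set(text.lower().split()))

def retrieve_demos(query: str, demos: list[dict], k: int = 2) -> list[dict]:
    """Return the k demonstrations most similar to the user query."""
    return sorted(demos, key=lambda d: score(query, d["query"]), reverse=True)[:k]

def build_prompt(query: str, api_doc: str, demos: list[dict]) -> str:
    """Prepend the retrieved (query, call) pairs to the zero-shot prompt."""
    shots = "\n".join(f"Query: {d['query']}\nCall: {d['call']}" for d in demos)
    return f"{api_doc}\n\n{shots}\n\nQuery: {query}\nCall:"

# Toy demonstration store (illustrative).
demos = [
    {"query": "weather in Paris tomorrow",
     "call": "get_weather(city='Paris', day='tomorrow')"},
    {"query": "convert 5 miles to km",
     "call": "convert(value=5, src='mi', dst='km')"},
]

prompt = build_prompt(
    "what is the weather in Tokyo today",
    "API: get_weather(city: str, day: str); convert(value, src, dst)",
    retrieve_demos("what is the weather in Tokyo today", demos),
)
```

A real system would replace the lexical overlap with a learned embedding retriever, but the augmentation step, retrieve then prepend, is the same.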

Cite

Text

Srinivasan et al. "NexusRaven: A Commercially-Permissive Language Model for Function Calling." NeurIPS 2023 Workshops: Instruction, 2023.

Markdown

[Srinivasan et al. "NexusRaven: A Commercially-Permissive Language Model for Function Calling." NeurIPS 2023 Workshops: Instruction, 2023.](https://mlanthology.org/neuripsw/2023/srinivasan2023neuripsw-nexusraven-a/)

BibTeX

@inproceedings{srinivasan2023neuripsw-nexusraven-a,
  title     = {{NexusRaven: A Commercially-Permissive Language Model for Function Calling}},
  author    = {Srinivasan, Venkat Krishna and Dong, Zhen and Zhu, Banghua and Yu, Brian and Mao, Hanzi and Mosk-Aoyama, Damon and Keutzer, Kurt and Jiao, Jiantao and Zhang, Jian},
  booktitle = {NeurIPS 2023 Workshops: Instruction},
  year      = {2023},
  url       = {https://mlanthology.org/neuripsw/2023/srinivasan2023neuripsw-nexusraven-a/}
}