MSc-SQL: Multi-Sample Critiquing Small Language Models for Text-to-SQL Translation

Abstract

Text-to-SQL generation enables non-experts to interact with databases via natural language. Recent advances rely on large closed-source models such as GPT-4, which present challenges in accessibility, privacy, and latency. To address these issues, we focus on developing small, efficient, open-source text-to-SQL models. We demonstrate the benefits of sampling multiple candidate SQL generations and propose our method, MSc-SQL, which critiques them using associated metadata. Our sample-critiquing model evaluates multiple outputs simultaneously, achieving state-of-the-art performance among open-source models while remaining competitive with larger models at a much lower cost. Full code can be found at github.com/layer6ai-labs/msc-sql.
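
As a rough illustration of the sample-then-critique idea described in the abstract, the sketch below executes several candidate SQL strings against a toy SQLite database, records execution metadata (result rows or an error message) for each, and then picks one. It is not the MSc-SQL implementation: the paper's critic is a trained model, whereas this sketch swaps in a simple agreement vote over execution results as a stand-in. The candidates are hardcoded where a small language model would normally be sampled, and the `employees` table and the helper names `execute_candidate` and `select_candidate` are hypothetical.

```python
import sqlite3
from collections import Counter


def execute_candidate(conn, sql):
    """Run one candidate query and return its metadata (rows or error text)."""
    try:
        rows = conn.execute(sql).fetchall()
        # Sort by repr so that row order does not affect the comparison.
        return {"sql": sql, "ok": True, "rows": tuple(sorted(rows, key=repr)), "error": None}
    except sqlite3.Error as exc:
        return {"sql": sql, "ok": False, "rows": None, "error": str(exc)}


def select_candidate(conn, candidates):
    """Stand-in critic (an assumption, not the paper's learned model):
    drop candidates that fail to execute, then return the first candidate
    whose result set is the most common among the remaining ones."""
    results = [execute_candidate(conn, sql) for sql in candidates]
    valid = [r for r in results if r["ok"]]
    if not valid:
        return None
    votes = Counter(r["rows"] for r in valid)
    best_rows, _ = votes.most_common(1)[0]
    return next(r["sql"] for r in valid if r["rows"] == best_rows)


# Toy database (hypothetical schema, for illustration only).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employees VALUES (?, ?, ?)",
    [("Ada", "eng", 120), ("Bob", "eng", 100), ("Cy", "ops", 90)],
)

# In practice these would be multiple samples from a text-to-SQL model.
candidates = [
    "SELECT name FROM employees WHERE dept = 'eng' AND salary > 110",
    "SELECT name FROM employee WHERE salary > 110",  # wrong table name, fails
    "SELECT name FROM employees WHERE salary > 110",
    "SELECT name FROM employees WHERE dept = 'ops'",
]

chosen = select_candidate(conn, candidates)
print(chosen)
```

A learned critic could replace `select_candidate` while keeping the same inputs: the candidate queries together with their execution metadata, considered jointly rather than one at a time.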

Cite

Text

Gorti et al. "MSc-SQL: Multi-Sample Critiquing Small Language Models for Text-to-SQL Translation." NeurIPS 2024 Workshops: TRL, 2024.

Markdown

[Gorti et al. "MSc-SQL: Multi-Sample Critiquing Small Language Models for Text-to-SQL Translation." NeurIPS 2024 Workshops: TRL, 2024.](https://mlanthology.org/neuripsw/2024/gorti2024neuripsw-mscsql/)

BibTeX

@inproceedings{gorti2024neuripsw-mscsql,
  title     = {{MSc-SQL: Multi-Sample Critiquing Small Language Models for Text-to-SQL Translation}},
  author    = {Gorti, Satya Krishna and Gofman, Ilan and Liu, Zhaoyan and Wu, Jiapeng and Vouitsis, Noël and Yu, Guangwei and Cresswell, Jesse C. and Hosseinzadeh, Rasa},
  booktitle = {NeurIPS 2024 Workshops: TRL},
  year      = {2024},
  url       = {https://mlanthology.org/neuripsw/2024/gorti2024neuripsw-mscsql/}
}