CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL

Abstract

We present CHASE-SQL, a novel framework addressing large language model (LLM) performance challenges for Text-to-SQL tasks by leveraging multi-agent modeling and test-time compute for improved candidate generation and selection. CHASE-SQL uses LLMs to generate diverse SQL candidates with: (1) a divide-and-conquer approach to break down complex queries, (2) chain-of-thought reasoning based on query execution plans, and (3) instance-aware synthetic example generation for tailored few-shot demonstrations. A selection agent ranks candidates via pairwise comparisons using a fine-tuned binary selection LLM, offering robust performance. This framework improves SQL query quality and diversity, achieving state-of-the-art execution accuracy of 73.0% on the BIRD Text-to-SQL benchmark test set, topping the leaderboard at the time of submission.
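The selection step described in the abstract, ranking candidates via pairwise comparisons, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a round-robin scheme in which every pair of candidates is compared once and the candidate with the most wins is returned. The names `select_candidate`, `prefer_first`, and `shorter_is_better` are hypothetical, and the comparator is a toy stand-in for the fine-tuned binary selection LLM.

```python
from itertools import combinations
from typing import Callable, Sequence


def select_candidate(
    candidates: Sequence[str],
    prefer_first: Callable[[str, str], bool],
) -> str:
    """Return the candidate that wins the most pairwise comparisons.

    `prefer_first(a, b)` is a hypothetical stand-in for the binary
    selection model: it returns True when `a` is judged better than `b`.
    """
    if not candidates:
        raise ValueError("need at least one candidate")

    wins = [0] * len(candidates)
    # Compare every unordered pair once and credit the preferred candidate.
    for i, j in combinations(range(len(candidates)), 2):
        if prefer_first(candidates[i], candidates[j]):
            wins[i] += 1
        else:
            wins[j] += 1

    # max() keeps the first maximal index, so ties go to the earliest candidate.
    best = max(range(len(candidates)), key=wins.__getitem__)
    return candidates[best]


def shorter_is_better(a: str, b: str) -> bool:
    """Toy comparator (assumption for illustration only): prefer the shorter query."""
    return len(a) <= len(b)


if __name__ == "__main__":
    sql_candidates = [
        "SELECT a FROM t WHERE x = 1",
        "SELECT a FROM t",
        "SELECT a, b FROM t",
    ]
    print(select_candidate(sql_candidates, shorter_is_better))
```

In a real pipeline the comparator would call the selection model with the question, the schema, and the two candidate queries. A round-robin over n candidates needs n(n-1)/2 such calls, so the cost of selection grows quadratically with the number of candidates.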

Cite

Text

Pourreza et al. "CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL." International Conference on Learning Representations, 2025.

Markdown

[Pourreza et al. "CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL." International Conference on Learning Representations, 2025.](https://mlanthology.org/iclr/2025/pourreza2025iclr-chasesql/)

BibTeX

@inproceedings{pourreza2025iclr-chasesql,
  title     = {{CHASE-SQL: Multi-Path Reasoning and Preference Optimized Candidate Selection in Text-to-SQL}},
  author    = {Pourreza, Mohammadreza and Li, Hailong and Sun, Ruoxi and Chung, Yeounoh and Talaei, Shayan and Kakkar, Gaurav Tarlok and Gan, Yu and Saberi, Amin and Ozcan, Fatma and Arik, Sercan O},
  booktitle = {International Conference on Learning Representations},
  year      = {2025},
  url       = {https://mlanthology.org/iclr/2025/pourreza2025iclr-chasesql/}
}