Improving Few-Shot Text-to-SQL with Meta Self-Training via Column Specificity
Abstract
The few-shot problem is an urgent challenge for single-table text-to-SQL. Existing methods ignore the potential value of unlabeled data and rely solely on a coarse-grained Meta-Learning (ML) algorithm that neglects how differently individual columns contribute to the optimization objective. This paper proposes a Meta Self-Training text-to-SQL (MST-SQL) method to address these problems. Specifically, MST-SQL is based on the column-wise HydraNet and adopts self-training as an effective mechanism for learning from readily available unlabeled samples. During each training epoch, it first predicts pseudo-labels for the unlabeled samples and then uses them to update the parameters. The update uses a fine-grained ML algorithm that weights the contribution of each column by its specificity, in order to further improve generalizability. Extensive experimental results on both open-domain and domain-specific benchmarks show that MST-SQL has significant advantages in few-shot scenarios and is also competitive in standard supervised settings.
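The per-epoch self-training loop described in the abstract (predict pseudo-labels for unlabeled samples, then update on them) can be illustrated with a minimal sketch. This is not the MST-SQL implementation: a toy 1-D nearest-centroid classifier stands in for HydraNet, the meta-learning step and the column-specificity weighting are omitted, and every name here (`fit_centroids`, `predict`, `self_train`, `min_margin`) is a hypothetical chosen for illustration. The confidence filter is likewise an assumption, since the abstract does not say how pseudo-labels are selected.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Return {label: centroid} for 1-D points (toy stand-in for model training)."""
    groups = {}
    for x, y in zip(points, labels):
        groups.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in groups.items()}


def predict(centroids, x):
    """Return (label, margin): the nearest centroid and its gap to the runner-up."""
    ranked = sorted(centroids, key=lambda y: abs(x - centroids[y]))
    best, second = ranked[0], ranked[1]
    margin = abs(x - centroids[second]) - abs(x - centroids[best])
    return best, margin


def self_train(labeled_x, labeled_y, unlabeled_x, epochs=3, min_margin=1.0):
    """Generic self-training: pseudo-label unlabeled points each epoch, then refit."""
    centroids = fit_centroids(labeled_x, labeled_y)
    for _ in range(epochs):
        pseudo_x, pseudo_y = [], []
        for x in unlabeled_x:
            label, margin = predict(centroids, x)
            # Assumed filter: keep only pseudo-labels the current model is sure about.
            if margin >= min_margin:
                pseudo_x.append(x)
                pseudo_y.append(label)
        # Update the model on the labeled data plus the accepted pseudo-labels.
        centroids = fit_centroids(labeled_x + pseudo_x, labeled_y + pseudo_y)
    return centroids


centroids = self_train([0.0, 10.0], ["a", "b"], [1.0, 2.0, 9.0, 8.0, 5.0])
print(centroids)
```

In this toy run the ambiguous point `5.0` is never pseudo-labeled because it sits equally far from both centroids, while the other four unlabeled points pull each centroid toward its cluster.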
Cite
Text
Guo et al. "Improving Few-Shot Text-to-SQL with Meta Self-Training via Column Specificity." International Joint Conference on Artificial Intelligence, 2022. doi:10.24963/IJCAI.2022/576
Markdown
[Guo et al. "Improving Few-Shot Text-to-SQL with Meta Self-Training via Column Specificity." International Joint Conference on Artificial Intelligence, 2022.](https://mlanthology.org/ijcai/2022/guo2022ijcai-improving/) doi:10.24963/IJCAI.2022/576
BibTeX
@inproceedings{guo2022ijcai-improving,
title = {{Improving Few-Shot Text-to-SQL with Meta Self-Training via Column Specificity}},
author = {Guo, Xinnan and Chen, Yongrui and Qi, Guilin and Wu, Tianxing and Xu, Hao},
booktitle = {International Joint Conference on Artificial Intelligence},
year = {2022},
pages = {4150--4156},
doi = {10.24963/IJCAI.2022/576},
url = {https://mlanthology.org/ijcai/2022/guo2022ijcai-improving/}
}