Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training

Abstract

Recently, there has been significant interest in learning contextual representations for various NLP tasks by leveraging large-scale text corpora to train powerful language models with self-supervised learning objectives, such as masked language modeling (MLM). Based on a pilot study, we observe three issues with existing general-purpose language models when they are applied in text-to-SQL semantic parsers: they fail to detect column mentions in the utterances, to infer column mentions from cell values, and to compose complex target SQL queries. To mitigate these issues, we present a model pre-training framework, Generation-Augmented Pre-training (GAP), that jointly learns representations of natural language utterances and table schemas by leveraging generation models to generate high-quality pre-training data. The GAP model is trained on 2 million utterance-schema pairs and 30K utterance-schema-SQL triples, whose utterances are produced by generation models. Experimental results show that neural semantic parsers that leverage the GAP model as a representation encoder obtain new state-of-the-art results on both the Spider and Criteria-to-SQL benchmarks.
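To illustrate how such a pre-trained model serves as a representation encoder for text-to-SQL, the sketch below jointly encodes an utterance with a serialized table schema using a standard transformer encoder. This is a minimal sketch, not the authors' implementation: the roberta-base checkpoint stands in for a GAP-style pre-trained model, and the column-list serialization of the schema is an assumed format.

# Minimal sketch: jointly encode an utterance and a serialized schema with a
# pre-trained encoder. "roberta-base" is a stand-in for a GAP-style checkpoint;
# the schema serialization format below is an assumption, not the paper's exact one.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
encoder = AutoModel.from_pretrained("roberta-base")

utterance = "How many singers are older than 30?"
# Flatten the schema into one string: each table name followed by its columns.
schema = "singer : singer_id , name , age | concert : concert_id , theme"

# Encode utterance and schema as a sentence pair so the encoder can attend
# across them and link column mentions in the utterance to schema items.
inputs = tokenizer(utterance, schema, return_tensors="pt", truncation=True)
with torch.no_grad():
    hidden = encoder(**inputs).last_hidden_state  # shape: (1, seq_len, hidden_dim)

# Downstream, a semantic parser (e.g., a grammar-based decoder) would consume
# these contextual token representations to produce the target SQL query.
print(hidden.shape)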

Cite

Text

Shi et al. "Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training." AAAI Conference on Artificial Intelligence, 2021. doi:10.1609/AAAI.V35I15.17627

Markdown

[Shi et al. "Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training." AAAI Conference on Artificial Intelligence, 2021.](https://mlanthology.org/aaai/2021/shi2021aaai-learning/) doi:10.1609/AAAI.V35I15.17627

BibTeX

@inproceedings{shi2021aaai-learning,
  title     = {{Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training}},
  author    = {Shi, Peng and Ng, Patrick and Wang, Zhiguo and Zhu, Henghui and Li, Alexander Hanbo and Wang, Jun and dos Santos, Cícero Nogueira and Xiang, Bing},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2021},
  pages     = {13806--13814},
  doi       = {10.1609/AAAI.V35I15.17627},
  url       = {https://mlanthology.org/aaai/2021/shi2021aaai-learning/}
}