Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding

Abstract

Dialogue understanding tasks often require abundant annotated data to achieve good performance, which presents challenges in low-resource settings. To alleviate this barrier, we explore few-shot data augmentation for dialogue understanding by prompting large pre-trained language models, and present a novel approach that iteratively improves augmentation quality by applying weakly supervised filters. We evaluate our methods on the emotion and act classification tasks in DailyDialog and the intent classification task in Facebook Multilingual Task-Oriented Dialogue. Models fine-tuned on our augmented data mixed with few-shot ground truth data approach or surpass existing full-shot state-of-the-art performance on both datasets. For DailyDialog specifically, using only 10% of the ground truth data, we outperform the current state-of-the-art model, which uses 100% of the data.
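The abstract's generate-then-filter loop can be sketched as follows. This is a minimal, illustrative stand-in, not the authors' code: the toy `generate_candidates` replaces few-shot prompting of a large language model, and the keyword-based `weak_filter` replaces a weakly supervised classifier trained on the few-shot seed data; all names and the canned utterances are hypothetical.

```python
# Hypothetical sketch of weakly supervised data augmentation:
# 1) prompt an LM for synthetic labeled utterances,
# 2) keep only candidates a weak filter agrees with,
# 3) mix survivors with the few-shot ground truth for fine-tuning.

def generate_candidates(label):
    # Stand-in for prompting a large pre-trained LM conditioned on
    # `label`; here we just return canned example utterances.
    canned = {
        "happiness": ["That's wonderful news!", "I'm so glad to hear it."],
        "anger": ["This is completely unacceptable.", "I can't believe you did that."],
    }
    return [(text, label) for text in canned.get(label, [])]

def weak_filter(text, label):
    # Stand-in for a weakly supervised filter (e.g. a classifier trained
    # on the seed data): keep a candidate only when the filter's
    # predicted label matches the label it was generated for.
    keywords = {
        "happiness": ("wonderful", "glad"),
        "anger": ("unacceptable", "believe"),
    }
    predicted = next(
        (lbl for lbl, kws in keywords.items()
         if any(k in text.lower() for k in kws)),
        None,
    )
    return predicted == label

def augment(labels):
    # One generate-then-filter round; the paper iterates such rounds
    # to improve augmentation quality.
    kept = []
    for label in labels:
        for text, lbl in generate_candidates(label):
            if weak_filter(text, lbl):
                kept.append((text, lbl))
    return kept

augmented = augment(["happiness", "anger"])
```

In the paper, the filtered synthetic examples would then be mixed with the few-shot ground truth data before fine-tuning a classifier.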

Cite

Text

Chen et al. "Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding." NeurIPS 2022 Workshops: SyntheticData4ML, 2022.

Markdown

[Chen et al. "Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding." NeurIPS 2022 Workshops: SyntheticData4ML, 2022.](https://mlanthology.org/neuripsw/2022/chen2022neuripsw-weakly/)

BibTeX

@inproceedings{chen2022neuripsw-weakly,
  title     = {{Weakly Supervised Data Augmentation Through Prompting for Dialogue Understanding}},
  author    = {Chen, Maximillian and Papangelis, Alexandros and Tao, Chenyang and Rosenbaum, Andy and Kim, Seokhwan and Liu, Yang and Yu, Zhou and Hakkani-Tur, Dilek},
  booktitle = {NeurIPS 2022 Workshops: SyntheticData4ML},
  year      = {2022},
  url       = {https://mlanthology.org/neuripsw/2022/chen2022neuripsw-weakly/}
}