Are Pre-Trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction

Abstract

With the recent success and popularity of pre-trained language models (LMs) in natural language processing, there has been a rise in efforts to understand their inner workings. In line with such interest, we propose a novel method that assists us in investigating the extent to which pre-trained LMs capture the syntactic notion of constituency. Our method provides an effective way of extracting constituency trees from the pre-trained LMs without training. In addition, we report intriguing findings in the induced trees, including the fact that pre-trained LMs outperform other approaches in correctly demarcating adverb phrases in sentences.

Cite

Text

Kim et al. "Are Pre-Trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction." International Conference on Learning Representations, 2020.

Markdown

[Kim et al. "Are Pre-Trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction." International Conference on Learning Representations, 2020.](https://mlanthology.org/iclr/2020/kim2020iclr-pretrained/)

BibTeX

@inproceedings{kim2020iclr-pretrained,
  title     = {{Are Pre-Trained Language Models Aware of Phrases? Simple but Strong Baselines for Grammar Induction}},
  author    = {Kim, Taeuk and Choi, Jihun and Edmiston, Daniel and Lee, Sang-goo},
  booktitle = {International Conference on Learning Representations},
  year      = {2020},
  url       = {https://mlanthology.org/iclr/2020/kim2020iclr-pretrained/}
}