Generalization Without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks

Abstract

Humans can understand and produce new utterances effortlessly, thanks to their compositional skills. Once a person learns the meaning of a new verb "dax," he or she can immediately understand the meaning of "dax twice" or "sing and dax." In this paper, we introduce the SCAN domain, consisting of a set of simple compositional navigation commands paired with the corresponding action sequences. We then test the zero-shot generalization capabilities of a variety of recurrent neural networks (RNNs) trained on SCAN with sequence-to-sequence methods. We find that RNNs can make successful zero-shot generalizations when the differences between training and test commands are small, so that they can apply "mix-and-match" strategies to solve the task. However, when generalization requires systematic compositional skills (as in the "dax" example above), RNNs fail spectacularly. We conclude with a proof-of-concept experiment in neural machine translation, suggesting that lack of systematicity might be partially responsible for neural networks’ notorious training data thirst.
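To make the "dax" example concrete, here is a minimal sketch of what systematic composition looks like when commands are mapped to action sequences. It is not the SCAN grammar or the authors' code: the vocabulary, the output tokens (`WALK`, `DAX`, and so on), the `interpret` function, and the `twice`/`thrice`/`and` handling are all assumptions chosen for illustration.

```python
# Toy command-to-action interpreter in the spirit of the abstract's example.
# Everything here (vocabulary, token names, grammar) is a hypothetical
# simplification, not the actual SCAN specification.
PRIMITIVES = {"walk": "WALK", "jump": "JUMP", "run": "RUN", "look": "LOOK"}
REPEATS = {"twice": 2, "thrice": 3}


def interpret(command, primitives=PRIMITIVES):
    """Map a command string to a list of action tokens, compositionally."""
    actions = []
    # "and" joins sub-commands that are executed in order.
    for clause in command.split(" and "):
        words = clause.split()
        verb, modifiers = words[0], words[1:]
        if verb not in primitives:
            raise KeyError(f"unknown verb: {verb}")
        count = 1
        for modifier in modifiers:
            count *= REPEATS[modifier]
        actions.extend([primitives[verb]] * count)
    return actions


# Adding a single new verb makes it usable in every known construction.
extended = {**PRIMITIVES, "dax": "DAX"}
print(interpret("dax twice", extended))
print(interpret("walk and dax", extended))
```

A rule-based interpreter like this one generalizes to "dax twice" as soon as "dax" is added to its lexicon, because the modifier and conjunction rules never refer to specific verbs. The abstract's claim is that sequence-to-sequence RNNs trained on examples do not reliably acquire this kind of verb-independent rule.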

Cite

Text

Lake and Baroni. "Generalization Without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks." International Conference on Machine Learning, 2018.

Markdown

[Lake and Baroni. "Generalization Without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks." International Conference on Machine Learning, 2018.](https://mlanthology.org/icml/2018/lake2018icml-generalization/)

BibTeX

@inproceedings{lake2018icml-generalization,
  title     = {{Generalization Without Systematicity: On the Compositional Skills of Sequence-to-Sequence Recurrent Networks}},
  author    = {Lake, Brenden and Baroni, Marco},
  booktitle = {International Conference on Machine Learning},
  year      = {2018},
  pages     = {2873--2882},
  volume    = {80},
  url       = {https://mlanthology.org/icml/2018/lake2018icml-generalization/}
}