Convolutional Neural Networks over Tree Structures for Programming Language Processing

Abstract

Programming language processing, similar to natural language processing, is an active research topic in software engineering and has also attracted growing interest in the artificial intelligence community. Unlike a natural language sentence, however, a program contains rich, explicit, and complicated structural information, so traditional NLP models may be inappropriate for programs. In this paper, we propose a novel tree-based convolutional neural network (TBCNN) for programming language processing, in which a convolution kernel is designed over programs' abstract syntax trees to capture structural information. TBCNN is a generic architecture for programming language processing; our experiments show its effectiveness in two different program analysis tasks: classifying programs according to functionality, and detecting code snippets of certain patterns. TBCNN outperforms baseline methods, including several neural models for NLP.
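
To make the idea of a convolution kernel over an abstract syntax tree more concrete, here is a minimal sketch. It is not the paper's implementation, and several choices are assumptions made purely for illustration: it parses Python source with the standard `ast` module, uses a window of one node plus its direct children, averages the child contributions, draws untrained random weights, and applies max pooling to obtain a fixed-size vector. The names `embed`, `convolve`, `encode`, `DIM`, and `FILTERS` are hypothetical.

```python
import ast
import math
import random
import zlib

DIM, FILTERS = 4, 3  # hypothetical sizes chosen for illustration


def embed(node):
    """Hypothetical fixed embedding, seeded by a checksum of the node type name."""
    rng = random.Random(zlib.crc32(type(node).__name__.encode()))
    return [rng.uniform(-1, 1) for _ in range(DIM)]


def make_weights(seed):
    """Untrained random FILTERS x DIM weight matrix (assumption: no learning here)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(FILTERS)]


def matvec(w, x):
    """Multiply matrix w by vector x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]


W_PARENT, W_CHILD = make_weights(0), make_weights(1)


def convolve(node):
    """Apply one kernel to the window formed by a node and its direct children."""
    out = matvec(W_PARENT, embed(node))
    children = list(ast.iter_child_nodes(node))
    for child in children:
        contrib = matvec(W_CHILD, embed(child))
        # Average over children so windows of different sizes stay comparable.
        out = [o + c / len(children) for o, c in zip(out, contrib)]
    return [math.tanh(o) for o in out]


def encode(source):
    """Slide the kernel over every AST node, then max-pool each filter."""
    feature_maps = [convolve(n) for n in ast.walk(ast.parse(source))]
    return [max(column) for column in zip(*feature_maps)]


vec_a = encode("def f(x):\n    return x + 1\n")
vec_b = encode("for i in range(3):\n    print(i)\n")
print(len(vec_a), len(vec_b))
```

Both programs map to vectors of length `FILTERS` regardless of tree size, which is what would let a downstream classifier consume programs of different shapes. In a trained model the weights and node embeddings would be learned rather than drawn at random.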

Cite

Text

Mou et al. "Convolutional Neural Networks over Tree Structures for Programming Language Processing." AAAI Conference on Artificial Intelligence, 2016. doi:10.1609/AAAI.V30I1.10139

Markdown

[Mou et al. "Convolutional Neural Networks over Tree Structures for Programming Language Processing." AAAI Conference on Artificial Intelligence, 2016.](https://mlanthology.org/aaai/2016/mou2016aaai-convolutional/) doi:10.1609/AAAI.V30I1.10139

BibTeX

@inproceedings{mou2016aaai-convolutional,
  title     = {{Convolutional Neural Networks over Tree Structures for Programming Language Processing}},
  author    = {Mou, Lili and Li, Ge and Zhang, Lu and Wang, Tao and Jin, Zhi},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2016},
  pages     = {1287--1293},
  doi       = {10.1609/AAAI.V30I1.10139},
  url       = {https://mlanthology.org/aaai/2016/mou2016aaai-convolutional/}
}