Pyramid Attention for Source Code Summarization

Abstract

This paper presents a multi-granularity method for source code summarization, which generates a concise functional description for a given code snippet. We observe that skilled programmers write and read source code hierarchically, paying close attention to conceptual entities such as statements, tokens, and sub-tokens, and to the mapping relations between them. Each kind of entity carries a different emphasis according to its granularity: statements at coarse granularity reveal the global logical semantics of the code, while sub-tokens at fine granularity relate more closely to its textual semantics. Driven by this observation, we demonstrate that a multi-granularity formulation incorporating these conceptual entities benefits the code summarization task. Concretely, the source code is transformed into a pyramidal representation, and a pyramid attention mechanism is then applied for efficient feature aggregation across its hierarchies. We instantiate our multi-granularity method using the proposed pyramid attention and name it PA-former (Pyramid Attention transformer). We evaluate it on two source code summarization benchmarks, where it surpasses prior works and achieves new state-of-the-art results. Our code and data are available at https://github.com/leichainju/pa-former.
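
The abstract describes the mechanism only at a high level: features are exchanged across granularity levels of a pyramidal code representation. As a rough illustration of what attention between two such levels could look like, below is a minimal PyTorch sketch. The module name, the two-level fine-to-coarse/coarse-to-fine exchange, and all dimensions are assumptions for clarity, not the paper's actual PA-former architecture (see the linked repository for that).

```python
# A minimal, illustrative sketch of cross-granularity ("pyramid") attention,
# NOT the authors' implementation. All tensor shapes, module names, and the
# two-level exchange below are assumptions made for illustration only.
import torch
import torch.nn as nn

class PyramidAttentionSketch(nn.Module):
    """Exchanges features between two granularity levels of a code snippet,
    e.g. sub-tokens (fine) and statements (coarse), via cross-attention."""

    def __init__(self, d_model: int = 256, n_heads: int = 4):
        super().__init__()
        # coarse queries attend over fine keys/values (aggregate upward)
        self.fine_to_coarse = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        # fine queries attend over coarse keys/values (broadcast downward)
        self.coarse_to_fine = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.norm_coarse = nn.LayerNorm(d_model)
        self.norm_fine = nn.LayerNorm(d_model)

    def forward(self, fine: torch.Tensor, coarse: torch.Tensor):
        # fine:   (batch, n_subtokens, d_model)
        # coarse: (batch, n_statements, d_model)
        up, _ = self.fine_to_coarse(coarse, fine, fine)      # statements gather sub-token detail
        coarse = self.norm_coarse(coarse + up)               # residual connection + layer norm
        down, _ = self.coarse_to_fine(fine, coarse, coarse)  # sub-tokens receive global context
        fine = self.norm_fine(fine + down)
        return fine, coarse

# toy usage: 12 sub-tokens and 3 statements from one snippet
fine = torch.randn(1, 12, 256)
coarse = torch.randn(1, 3, 256)
fine_out, coarse_out = PyramidAttentionSketch()(fine, coarse)
print(fine_out.shape, coarse_out.shape)  # torch.Size([1, 12, 256]) torch.Size([1, 3, 256])
```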

Cite

Text

Chai and Li. "Pyramid Attention for Source Code Summarization." Neural Information Processing Systems, 2022.

Markdown

[Chai and Li. "Pyramid Attention for Source Code Summarization." Neural Information Processing Systems, 2022.](https://mlanthology.org/neurips/2022/chai2022neurips-pyramid/)

BibTeX

@inproceedings{chai2022neurips-pyramid,
  title     = {{Pyramid Attention for Source Code Summarization}},
  author    = {Chai, Lei and Li, Ming},
  booktitle = {Neural Information Processing Systems},
  year      = {2022},
  url       = {https://mlanthology.org/neurips/2022/chai2022neurips-pyramid/}
}