TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer

Abstract

The problem of fixing errors in programs has attracted substantial interest over the years. The key challenge in building an effective code-fixing tool is to capture a wide range of errors while maintaining high accuracy. In this paper, we address this challenge and present a new learning-based system called TFix. TFix works directly on program text and phrases the problem of code fixing as a text-to-text task. This enables it to leverage a powerful Transformer-based model that is pre-trained on natural language and fine-tuned to generate code fixes, using a large, high-quality dataset obtained from GitHub commits. TFix is not specific to a particular programming language or class of defects; in fact, its precision improves when it is fine-tuned simultaneously on 52 different error types reported by a popular static analyzer. Our evaluation on a massive dataset of JavaScript programs shows that TFix is practically effective: it synthesizes code that fixes the error in 67 percent of cases and significantly outperforms existing learning-based approaches.
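
To make the text-to-text framing concrete, the following minimal sketch shows one way a static-analyzer report could be serialized into a single input string for a sequence-to-sequence model. The `LintReport` class, the `build_query` function, the `fix <rule> <message> <line>:` layout, and the one-line context window are all assumptions made for illustration; the abstract does not specify the exact input format.

```python
from dataclasses import dataclass


@dataclass
class LintReport:
    """Hypothetical container for one static-analyzer finding."""
    rule: str      # error type, e.g. "no-unused-vars"
    message: str   # the analyzer's human-readable message
    line: int      # 1-indexed line the error was reported on


def build_query(source: str, report: LintReport, context: int = 1) -> str:
    """Serialize one reported error into a single text query.

    The "fix <rule> <message> <line>:" layout is an assumed format
    for illustration, not one confirmed by the abstract.
    """
    lines = source.splitlines()
    idx = report.line - 1
    if not 0 <= idx < len(lines):
        raise ValueError(f"line {report.line} is outside the source")
    # Keep a small window of surrounding lines as context.
    start = max(0, idx - context)
    end = min(len(lines), idx + context + 1)
    window = "\n".join(lines[start:end])
    return f"fix {report.rule} {report.message} {lines[idx].strip()}:\n{window}"


js_source = "function f(a) {\n  var unused = 1;\n  return a;\n}"
report = LintReport(
    rule="no-unused-vars",
    message="'unused' is assigned a value but never used.",
    line=2,
)
query = build_query(js_source, report)
print(query)
```

Under this framing the model's target output would simply be the corrected text for the same window, so one model can serve every error type: the rule name in the query is the only thing that distinguishes one kind of fix from another.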

Cite

Text

Berabi et al. "TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer." International Conference on Machine Learning, 2021.

Markdown

[Berabi et al. "TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer." International Conference on Machine Learning, 2021.](https://mlanthology.org/icml/2021/berabi2021icml-tfix/)

BibTeX

@inproceedings{berabi2021icml-tfix,
  title     = {{TFix: Learning to Fix Coding Errors with a Text-to-Text Transformer}},
  author    = {Berabi, Berkay and He, Jingxuan and Raychev, Veselin and Vechev, Martin},
  booktitle = {International Conference on Machine Learning},
  year      = {2021},
  pages     = {780-791},
  volume    = {139},
  url       = {https://mlanthology.org/icml/2021/berabi2021icml-tfix/}
}