Bridging Code-Text Representation Gap Using Explanation
Abstract
This paper studies Code-Text Representation (CTR) learning, which aims to learn general-purpose representations that support downstream code/text applications such as code search, i.e., finding code that matches a textual query. However, state-of-the-art methods do not focus on bridging the gap between the code and text modalities. In this paper, we bridge this gap by providing an intermediate representation, which we view as an “explanation.” Our contribution is threefold: First, we propose four types of explanation utilization methods for CTR and compare their effectiveness. Second, we show that using explanations as model input is desirable. Third, we confirm that even automatically generated explanations can lead to a drastic performance gain. To the best of our knowledge, this is the first work to define and categorize code explanations for enhancing code understanding/representation.
Cite
Text
Han et al. "Bridging Code-Text Representation Gap Using Explanation." Proceedings of The 13th Asian Conference on Machine Learning, 2021.
Markdown
[Han et al. "Bridging Code-Text Representation Gap Using Explanation." Proceedings of The 13th Asian Conference on Machine Learning, 2021.](https://mlanthology.org/acml/2021/han2021acml-bridging/)
BibTeX
@inproceedings{han2021acml-bridging,
title = {{Bridging Code-Text Representation Gap Using Explanation}},
author = {Han, Hojae and Lee, Youngwon and Kim, Minsoo and Hwang, Seung-won},
booktitle = {Proceedings of The 13th Asian Conference on Machine Learning},
year = {2021},
pages = {1033--1048},
volume = {157},
url = {https://mlanthology.org/acml/2021/han2021acml-bridging/}
}