InverseCoder: Self-Improving Instruction-Tuned Code LLMs with Inverse-Instruct

Abstract

Recent advancements in open-source code large language models (LLMs) have been driven by fine-tuning on data generated by powerful closed-source LLMs, which is expensive to obtain. This paper explores whether it is possible to use a fine-tuned open-source model to generate additional data to augment its instruction-tuning dataset. We make two observations: (1) a code snippet can serve as the response to different instructions, and (2) instruction-tuned code LLMs perform better at translating code into instructions than the reverse. Based on these observations, we propose Inverse-Instruct, a data augmentation technique that uses a fine-tuned LLM to generate additional instructions for the code responses in its own training dataset. The additional instruction-response pairs are added to the original dataset, and a stronger code LLM can be obtained by fine-tuning on the augmented dataset. We empirically validate Inverse-Instruct on a range of open-source code models (e.g., CodeLlama-Python and DeepSeek-Coder) and benchmarks (e.g., HumanEval(+), MBPP(+), DS-1000, and MultiPL-E), showing that it consistently improves the base models.
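The augmentation loop described above can be sketched in a few lines. This is a hypothetical illustration, not the authors' implementation: `summarize_code` is a placeholder standing in for the fine-tuned code LLM's "code to instruction" generation step.

```python
# Hypothetical sketch of the Inverse-Instruct augmentation loop:
# derive new instructions from existing code responses, then add the
# resulting (instruction, code) pairs back into the dataset.

def summarize_code(code):
    # Placeholder for the model's code -> instruction generation.
    # Here we fake it by describing the code's first line.
    first_line = code.strip().splitlines()[0]
    return f"Write code that does the following: {first_line}"

def inverse_instruct(dataset):
    """Augment an instruction-tuning dataset with instructions
    generated from its own code responses."""
    augmented = list(dataset)  # keep the original pairs
    for instruction, code in dataset:
        new_instruction = summarize_code(code)
        if new_instruction != instruction:  # skip exact duplicates
            augmented.append((new_instruction, code))
    return augmented

# Toy usage: one original pair yields one extra pair after augmentation.
data = [("Sum two numbers.", "def add(a, b):\n    return a + b")]
print(len(inverse_instruct(data)))  # -> 2
```

In the paper's setting, the summarization step is performed by the instruction-tuned model itself, which (per observation 2 above) is better at code-to-instruction translation than the reverse; fine-tuning on the augmented set then yields the stronger model.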

Cite

Text

Wu et al. "InverseCoder: Self-Improving Instruction-Tuned Code LLMs with Inverse-Instruct." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I24.34742

Markdown

[Wu et al. "InverseCoder: Self-Improving Instruction-Tuned Code LLMs with Inverse-Instruct." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/wu2025aaai-inversecoder/) doi:10.1609/AAAI.V39I24.34742

BibTeX

@inproceedings{wu2025aaai-inversecoder,
  title     = {{InverseCoder: Self-Improving Instruction-Tuned Code LLMs with Inverse-Instruct}},
  author    = {Wu, Yutong and Huang, Di and Shi, Wenxuan and Wang, Wei and Pu, Yewen and Gao, Lingzhe and Liu, Shihao and Nan, Ziyuan and Yuan, Kaizhao and Zhang, Rui and Zhang, Xishan and Du, Zidong and Guo, Qi and Yin, Dawei and Hu, Xing and Chen, Yunji},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {25525-25533},
  doi       = {10.1609/AAAI.V39I24.34742},
  url       = {https://mlanthology.org/aaai/2025/wu2025aaai-inversecoder/}
}