Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation

Abstract

In this paper, we introduce Kun, a novel approach for creating high-quality instruction-tuning datasets for large language models (LLMs) without relying on manual annotations. Adapting a self-training algorithm based on instruction back-translation and answer polishment, Kun leverages unlabelled data from diverse sources such as Wudao, Wanjuan, and SkyPile to generate a substantial dataset of over a million Chinese instructional data points. This approach departs from traditional methods by using a self-curation process to refine and select the most effective instruction-output pairs. Our experiments with the 6B-parameter Yi model across various benchmarks demonstrate Kun's robustness and scalability. Our method's core contributions lie in its algorithmic advancement, which enhances data retention and clarity, and in its data generation approach, which substantially reduces the reliance on costly and time-consuming manual annotation. This methodology offers a scalable and efficient way to improve the instruction-following capabilities of LLMs, with significant implications for their application across diverse fields.
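The pipeline the abstract describes can be sketched in three steps: back-translate an instruction from a piece of unlabelled text, score the resulting pair (self-curation), and polish the answer before keeping it. The sketch below is a minimal, hedged illustration of that flow, not the paper's actual implementation; `llm` is a hypothetical `prompt -> str` completion function (any chat API could fill the role), and all prompt wordings and the score threshold are assumptions.

```python
"""Minimal sketch of a Kun-style data pipeline, assuming a generic
llm(prompt) -> str completion function (hypothetical; any chat API works).
Steps: instruction back-translation, self-curation scoring, answer polishing."""


def back_translate(text: str, llm) -> str:
    # Instruction back-translation: ask the model to invent the
    # instruction that this unlabelled text would best answer.
    return llm(f"Write the instruction this text best answers:\n{text}")


def curation_score(instruction: str, answer: str, llm) -> int:
    # Self-curation: the model rates pair quality on a 1-5 scale;
    # we parse the first digit of its reply, defaulting to 1.
    reply = llm(
        "Rate 1-5 how well this answer follows the instruction.\n"
        f"Instruction: {instruction}\nAnswer: {answer}\nScore:"
    )
    digits = [c for c in reply if c.isdigit()]
    return int(digits[0]) if digits else 1


def polish_answer(instruction: str, answer: str, llm) -> str:
    # Answer polishing: rewrite the raw text so it directly
    # addresses the generated instruction.
    return llm(
        "Rewrite the answer to directly address the instruction.\n"
        f"Instruction: {instruction}\nAnswer: {answer}"
    )


def build_pairs(corpus, llm, threshold: int = 4):
    # Keep only pairs whose self-curation score clears the threshold,
    # polishing the answer of each retained pair.
    pairs = []
    for text in corpus:
        instruction = back_translate(text, llm)
        if curation_score(instruction, text, llm) >= threshold:
            pairs.append({
                "instruction": instruction,
                "output": polish_answer(instruction, text, llm),
            })
    return pairs
```

In practice the same loop would run over large unlabelled corpora (the paper names Wudao, Wanjuan, and SkyPile), with the scoring prompt tuned so that only high-quality pairs survive curation.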

Cite

Text

Zheng et al. "Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation." ICLR 2025 Workshops: SSI-FM, 2025.

Markdown

[Zheng et al. "Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation." ICLR 2025 Workshops: SSI-FM, 2025.](https://mlanthology.org/iclrw/2025/zheng2025iclrw-kun/)

BibTeX

@inproceedings{zheng2025iclrw-kun,
  title     = {{Kun: Answer Polishment for Chinese Self-Alignment with Instruction Back-Translation}},
  author    = {Zheng, Tianyu and Guo, Shuyue and Qu, Xingwei and Guo, Jiawei and Du, Xeron and Lin, Chenghua and Huang, Stephen and Fu, Jie and Zhang, Ge},
  booktitle = {ICLR 2025 Workshops: SSI-FM},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/zheng2025iclrw-kun/}
}