Text Generation with Diffusion Language Models: A Pre-Training Approach with Continuous Paragraph Denoise

Abstract

In this paper, we introduce a novel dIffusion language modEl pre-training framework for text generation, which we call GENIE. GENIE is a large-scale pre-trained diffusion language model that consists of an encoder and a diffusion-based decoder, and it generates text by gradually transforming a random noise sequence into a coherent text sequence. To pre-train GENIE on a large-scale language corpus, we design a new continuous paragraph denoise objective, which encourages the diffusion-based decoder to reconstruct a clean text paragraph from a corrupted version while preserving semantic and syntactic coherence. We evaluate GENIE on four downstream text generation benchmarks, namely XSum, CNN/DailyMail, Gigaword, and CommonGen. Our experimental results show that GENIE achieves performance comparable to state-of-the-art autoregressive models on these benchmarks and generates more diverse text samples. The code and models of GENIE are available at https://github.com/microsoft/ProphetNet/tree/master/GENIE.
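For context, the sketch below gives the standard continuous-diffusion denoising setup that objectives like the continuous paragraph denoise loss build on. It is an illustrative simplification, not the exact formulation from the paper: the noise schedule \(\bar{\alpha}_t\), the denoising network \(f_\theta\), and the encoder conditioning \(c\) are generic placeholders, and the paper's full objective includes additional embedding/rounding terms.

```latex
% Forward corruption of the clean paragraph representation x_0
% (generic noise schedule \bar{\alpha}_t, assumed for illustration):
\[
  q(x_t \mid x_0) = \mathcal{N}\!\big(\sqrt{\bar{\alpha}_t}\, x_0,\ (1 - \bar{\alpha}_t) I\big)
\]
% Denoising objective: the diffusion decoder f_\theta, conditioned on the
% encoder output c, is trained to recover x_0 from the noised latent x_t:
\[
  \mathcal{L}_{\text{denoise}} = \mathbb{E}_{t,\, x_0,\, \epsilon}\,
    \big\| f_\theta(x_t, t, c) - x_0 \big\|_2^2
\]
```

At inference time, generation runs this process in reverse: starting from pure Gaussian noise \(x_T\), the decoder iteratively denoises toward a clean paragraph representation, which is then mapped back to tokens.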

Cite

Text

Lin et al. "Text Generation with Diffusion Language Models: A Pre-Training Approach with Continuous Paragraph Denoise." International Conference on Machine Learning, 2023.

Markdown

[Lin et al. "Text Generation with Diffusion Language Models: A Pre-Training Approach with Continuous Paragraph Denoise." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/lin2023icml-text/)

BibTeX

@inproceedings{lin2023icml-text,
  title     = {{Text Generation with Diffusion Language Models: A Pre-Training Approach with Continuous Paragraph Denoise}},
  author    = {Lin, Zhenghao and Gong, Yeyun and Shen, Yelong and Wu, Tong and Fan, Zhihao and Lin, Chen and Duan, Nan and Chen, Weizhu},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {21051--21064},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/lin2023icml-text/}
}