Step-Unrolled Denoising Autoencoders for Text Generation

Abstract

In this paper we propose a new generative model of text, the Step-unrolled Denoising Autoencoder (SUNDAE), that does not rely on autoregressive models. Similarly to denoising diffusion techniques, SUNDAE is repeatedly applied to a sequence of tokens, starting from random inputs and improving them each time until convergence. We present a simple new improvement operator that converges in fewer iterations than diffusion methods, while qualitatively producing better samples on natural language datasets. SUNDAE achieves state-of-the-art results (among non-autoregressive methods) on the WMT'14 English-to-German translation task and good qualitative results on unconditional language modeling on the Colossal Clean Common Crawl dataset and a dataset of Python code from GitHub. The non-autoregressive nature of SUNDAE opens up possibilities beyond left-to-right prompted generation, by filling in arbitrary blank patterns in a template.
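The sampling procedure described above — start from random tokens, repeatedly apply an improvement operator, stop at convergence — can be sketched as follows. This is a minimal illustration, not the paper's implementation: `improve` stands in for the learned denoising network, and `toy_improve` is a hypothetical deterministic operator used only so the loop runs end to end.

```python
import random

def sample_sundae(improve, vocab_size, seq_len, max_steps=16, seed=0):
    """SUNDAE-style sampling (sketch): start from a random token
    sequence and repeatedly apply the improvement operator until the
    sequence stops changing or a step budget is exhausted."""
    rng = random.Random(seed)
    tokens = [rng.randrange(vocab_size) for _ in range(seq_len)]
    for _ in range(max_steps):
        new_tokens = improve(tokens)
        if new_tokens == tokens:  # converged: operator is a fixed point here
            break
        tokens = new_tokens
    return tokens

# Hypothetical stand-in for the learned denoiser: nudges every token
# one step toward token 0. A real model would instead predict tokens
# from the surrounding context in parallel.
def toy_improve(tokens):
    return [max(t - 1, 0) for t in tokens]

print(sample_sundae(toy_improve, vocab_size=8, seq_len=5))
```

Because the operator updates all positions in parallel, the same loop supports the template in-filling mentioned in the abstract: fixed (prompt) positions are simply clamped between applications of `improve`.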

Cite

Text

Savinov et al. "Step-Unrolled Denoising Autoencoders for Text Generation." International Conference on Learning Representations, 2022.

Markdown

[Savinov et al. "Step-Unrolled Denoising Autoencoders for Text Generation." International Conference on Learning Representations, 2022.](https://mlanthology.org/iclr/2022/savinov2022iclr-stepunrolled/)

BibTeX

@inproceedings{savinov2022iclr-stepunrolled,
  title     = {{Step-Unrolled Denoising Autoencoders for Text Generation}},
  author    = {Savinov, Nikolay and Chung, Junyoung and Bińkowski, Mikołaj and Elsen, Erich and van den Oord, Aäron},
  booktitle = {International Conference on Learning Representations},
  year      = {2022},
  url       = {https://mlanthology.org/iclr/2022/savinov2022iclr-stepunrolled/}
}