Fast Generation for Convolutional Autoregressive Models

Abstract

Convolutional autoregressive models have recently demonstrated state-of-the-art performance on a number of generation tasks. While fast, parallel training methods have been crucial for their success, generation is typically implemented naively, recomputing the same hidden states at every step. This results in slow generation, making such models infeasible for production environments. In this work, we describe a method to speed up generation in convolutional autoregressive models. The key idea is to cache hidden states to avoid redundant computation. We apply our fast generation method to the Wavenet and PixelCNN++ models and achieve up to $21\times$ and $183\times$ speedups respectively.
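The caching idea from the abstract can be sketched for a single dilated causal convolution layer (kernel size 2, as in WaveNet). The names below are hypothetical illustrations, not the paper's code: the naive step re-convolves the entire history and keeps only the newest output (O(T) per sample), while the cached version keeps a queue of past inputs so each step is O(1). A real WaveNet layer adds gated activations and residual connections on top of this.

```python
import numpy as np
from collections import deque

def naive_last_output(x, w, dilation):
    """Naive generation: convolve the whole history at every step and
    keep only the newest output. Work per sample grows with T."""
    x_pad = np.concatenate([np.zeros(dilation), x])  # causal zero-padding
    T = len(x)
    # y[t] = w[0] * x[t - dilation] + w[1] * x[t]
    y = [w[0] * x_pad[t] + w[1] * x_pad[t + dilation] for t in range(T)]
    return y[-1]

class CachedConv:
    """Kernel-size-2 dilated causal conv with a cache of past inputs,
    so generating each new sample costs O(1) instead of O(T).
    (Minimal sketch of the hidden-state caching idea.)"""

    def __init__(self, w, dilation):
        self.w = w
        # Rolling buffer holding the inputs from the last `dilation` steps;
        # initialized to zeros, matching the causal zero-padding above.
        self.queue = deque([0.0] * dilation)

    def step(self, x_t):
        past = self.queue.popleft()  # cached x[t - dilation], not recomputed
        self.queue.append(x_t)
        return self.w[0] * past + self.w[1] * x_t
```

Feeding the same sequence through both paths produces identical outputs, which is the point: caching changes the cost of generation, not its result.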

Cite

Text

Ramachandran et al. "Fast Generation for Convolutional Autoregressive Models." International Conference on Learning Representations, 2017.

Markdown

[Ramachandran et al. "Fast Generation for Convolutional Autoregressive Models." International Conference on Learning Representations, 2017.](https://mlanthology.org/iclr/2017/ramachandran2017iclr-fast/)

BibTeX

@inproceedings{ramachandran2017iclr-fast,
  title     = {{Fast Generation for Convolutional Autoregressive Models}},
  author    = {Ramachandran, Prajit and Le Paine, Tom and Khorrami, Pooya and Babaeizadeh, Mohammad and Chang, Shiyu and Zhang, Yang and Hasegawa-Johnson, Mark A. and Campbell, Roy H. and Huang, Thomas S.},
  booktitle = {International Conference on Learning Representations},
  year      = {2017},
  url       = {https://mlanthology.org/iclr/2017/ramachandran2017iclr-fast/}
}