Efficient Rematerialization for Deep Networks
Abstract
When training complex neural networks, memory usage can be an important bottleneck. The question of when to rematerialize, i.e., to recompute intermediate values rather than retaining them in memory, becomes critical to achieving the best time and space efficiency. In this work we consider the rematerialization problem and devise efficient algorithms that use structural characterizations of computation graphs (treewidth and pathwidth) to obtain provably efficient rematerialization schedules. Our experiments demonstrate the performance of these algorithms on many common deep learning models.
Cite
Text
Kumar et al. "Efficient Rematerialization for Deep Networks." Neural Information Processing Systems, 2019.

Markdown

[Kumar et al. "Efficient Rematerialization for Deep Networks." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/kumar2019neurips-efficient/)

BibTeX
@inproceedings{kumar2019neurips-efficient,
  title     = {{Efficient Rematerialization for Deep Networks}},
  author    = {Kumar, Ravi and Purohit, Manish and Svitkina, Zoya and Vee, Erik and Wang, Joshua},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {15172--15181},
  url       = {https://mlanthology.org/neurips/2019/kumar2019neurips-efficient/}
}