Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single
Abstract
We propose ES-Single, an evolution strategies-based algorithm for estimating gradients in unrolled computation graphs. Like the recently proposed Persistent Evolution Strategies (PES), ES-Single is unbiased, and it overcomes the chaos that arises from recursive function applications by smoothing the meta-loss landscape. ES-Single samples a single perturbation per particle, which is kept fixed over the course of an inner problem (i.e., perturbations are not re-sampled for each partial unroll). Compared to PES, ES-Single is simpler to implement and has lower variance: its variance is constant with respect to the number of truncated unrolls, which removes a key barrier to applying ES to long inner problems with short truncations. We show that ES-Single is unbiased for quadratic inner problems, and demonstrate empirically that its variance can be substantially lower than that of PES. ES-Single consistently outperforms PES on a variety of tasks, including a synthetic benchmark task, hyperparameter optimization, training recurrent neural networks, and training learned optimizers.
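The central idea, one perturbation per particle held fixed across all truncated unrolls of an inner problem, can be illustrated with a small sketch. This is not the paper's implementation. It assumes a toy scalar inner problem (gradient descent on f(x) = 0.5 x^2 with step size theta as the meta-parameter), a single antithetic pair of particles, and the standard antithetic ES estimator applied to each truncation's loss. The names `unroll` and `es_single_inner_problem` are hypothetical.

```python
import random


def unroll(state, theta, num_steps):
    """Toy inner problem (an assumption): gradient descent on f(x) = 0.5 * x**2
    with step size theta. Returns the new state and the summed loss."""
    total_loss = 0.0
    for _ in range(num_steps):
        state = state - theta * state  # the gradient of 0.5 * x**2 is x
        total_loss += 0.5 * state ** 2
    return state, total_loss


def es_single_inner_problem(theta, sigma, total_steps, trunc_len, init_state, rng):
    """Return the perturbation and one gradient estimate per truncated unroll."""
    # The perturbation is sampled once and reused for every truncation.
    eps = rng.gauss(0.0, sigma)
    state_pos = state_neg = init_state
    estimates = []
    for _ in range(total_steps // trunc_len):
        # Each antithetic particle carries its own inner state across truncations.
        state_pos, loss_pos = unroll(state_pos, theta + eps, trunc_len)
        state_neg, loss_neg = unroll(state_neg, theta - eps, trunc_len)
        # Antithetic ES estimate for this truncation, using the same eps each time.
        estimates.append(eps * (loss_pos - loss_neg) / (2.0 * sigma ** 2))
    return eps, estimates


rng = random.Random(0)
eps, estimates = es_single_inner_problem(
    theta=0.1, sigma=0.01, total_steps=20, trunc_len=5, init_state=1.0, rng=rng
)
print(len(estimates), all(g < 0 for g in estimates))
```

In this toy setting a larger step size lowers the loss, so every per-truncation estimate is negative. A practical version would average over many particles and draw a fresh perturbation whenever the inner problem is reset.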
Cite
Text
Vicol. "Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single." International Conference on Machine Learning, 2023.
Markdown
[Vicol. "Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/vicol2023icml-lowvariance/)
BibTeX
@inproceedings{vicol2023icml-lowvariance,
title = {{Low-Variance Gradient Estimation in Unrolled Computation Graphs with ES-Single}},
author = {Vicol, Paul},
booktitle = {International Conference on Machine Learning},
year = {2023},
pages = {35084-35119},
volume = {202},
url = {https://mlanthology.org/icml/2023/vicol2023icml-lowvariance/}
}