Variance Reduction of Stochastic Hypergradient Estimation by Mixed Fixed-Point Iteration
Abstract
The hypergradient represents how the hyperparameter of an optimization problem (or inner problem) changes an outer cost through the optimized inner parameter, and it plays a crucial role in hyperparameter optimization, meta-learning, and data influence estimation. This paper studies hypergradient computation involving a stochastic inner problem, a typical machine learning setting in which the empirical loss is estimated by minibatches. Stochastic hypergradient estimation requires estimating products of Jacobian matrices of the inner iteration. Current methods suffer from large estimation variance because they depend on a specific sequence of Jacobian samples to estimate this product. This paper overcomes this problem by *mixing* two different stochastic hypergradient estimation methods that use distinct sequences of Jacobian samples. Furthermore, we show that the proposed method enables almost sure convergence to the true hypergradient through the stochastic Krasnosel'skiĭ-Mann iteration. Theoretical analysis demonstrates that, compared to existing approaches, our method achieves lower asymptotic variance bounds while maintaining comparable computational complexity. Empirical evaluations on synthetic and real-world tasks verify our theoretical results and demonstrate superior variance reduction over existing methods.
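The abstract refers to the stochastic Krasnosel'skiĭ-Mann iteration as the tool behind the almost-sure convergence result. Below is a minimal, self-contained sketch (not the paper's mixed estimator) of that averaging scheme on a toy linear fixed-point equation v = Av + b, where A stands in for an inner-iteration Jacobian observed only through noisy samples; all names, dimensions, noise levels, and step-size choices are illustrative assumptions.

```python
import numpy as np

# Toy stochastic Krasnosel'skii-Mann iteration (illustrative only):
# solve the fixed-point equation v = A v + b when A is only available
# through noisy samples A_k with E[A_k] = A.
rng = np.random.default_rng(0)
d = 5
Q, _ = np.linalg.qr(rng.normal(size=(d, d)))
A = 0.7 * Q                                   # contraction: spectral norm 0.7 < 1
b = rng.normal(size=d)
v_true = np.linalg.solve(np.eye(d) - A, b)    # exact fixed point for reference

v = np.zeros(d)
for k in range(1, 20001):
    A_k = A + 0.1 * rng.normal(size=(d, d))   # noisy "Jacobian" sample
    alpha_k = 1.0 / k ** 0.75                 # diminishing averaging weight
    v = (1 - alpha_k) * v + alpha_k * (A_k @ v + b)   # KM averaging step

print("relative error:", np.linalg.norm(v - v_true) / np.linalg.norm(v_true))
```

The averaging weight alpha_k shrinks over iterations, so the noise injected by each Jacobian sample is progressively damped; this is the standard stochastic-approximation mechanism underlying almost-sure convergence claims of this kind.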
Cite
Text
Terashita and Hara. "Variance Reduction of Stochastic Hypergradient Estimation by Mixed Fixed-Point Iteration." Transactions on Machine Learning Research, 2025.
Markdown
[Terashita and Hara. "Variance Reduction of Stochastic Hypergradient Estimation by Mixed Fixed-Point Iteration." Transactions on Machine Learning Research, 2025.](https://mlanthology.org/tmlr/2025/terashita2025tmlr-variance/)
BibTeX
@article{terashita2025tmlr-variance,
title = {{Variance Reduction of Stochastic Hypergradient Estimation by Mixed Fixed-Point Iteration}},
author = {Terashita, Naoyuki and Hara, Satoshi},
journal = {Transactions on Machine Learning Research},
year = {2025},
url = {https://mlanthology.org/tmlr/2025/terashita2025tmlr-variance/}
}