Optimization with Access to Auxiliary Information
Abstract
We investigate the fundamental optimization question of minimizing a \emph{target} function $f(x)$ whose gradients are expensive to compute or have limited availability, given access to some \emph{auxiliary} side function $h(x)$ whose gradients are cheap or more readily available. This formulation captures many settings of practical relevance, such as i) re-using batches in SGD, ii) transfer learning, iii) federated learning, and iv) training with compressed models/dropout. We propose two generic new algorithms that apply in all these settings, and we prove that this framework is beneficial under a Hessian similarity assumption between the target and the side information: the smaller this similarity measure, the larger the benefit. We also show a potential benefit from stochasticity when the auxiliary noise is correlated with that of the target function.
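To make the setup concrete, below is a minimal sketch, assuming a simple bias-corrected scheme in which an occasional expensive target gradient re-anchors a run of cheap auxiliary-gradient steps. The function `aux_corrected_gd` and its parameters are hypothetical choices for illustration, not the paper's exact algorithms.

```python
# Illustrative sketch (assumption, not the paper's algorithms): cheap steps
# on the auxiliary gradient grad_h, shifted by a correction computed from one
# expensive target gradient grad_f at a periodically refreshed anchor point.
import numpy as np

def aux_corrected_gd(grad_f, grad_h, x0, eta=0.1, outer_steps=20, inner_steps=10):
    """At each outer step, compute one expensive grad_f(anchor), then take
    inner_steps cheap updates using grad_h(x) + (grad_f(anchor) - grad_h(anchor)).
    The correction makes the step unbiased at the anchor; it stays accurate
    nearby when the Hessians of f and h are similar."""
    x = np.asarray(x0, dtype=float)
    for _ in range(outer_steps):
        anchor = x.copy()
        correction = grad_f(anchor) - grad_h(anchor)  # one expensive call
        for _ in range(inner_steps):                  # cheap inner steps
            x = x - eta * (grad_h(x) + correction)
    return x

# Toy usage: quadratic target and auxiliary functions with similar Hessians.
if __name__ == "__main__":
    A = np.diag([1.0, 2.0]); B = np.diag([1.1, 1.9]); b = np.array([1.0, -1.0])
    grad_f = lambda x: A @ x - b   # "expensive" target gradient
    grad_h = lambda x: B @ x       # "cheap" auxiliary gradient
    print(aux_corrected_gd(grad_f, grad_h, x0=np.zeros(2)))  # approaches A^{-1} b
```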
Cite
Text
Chayti and Karimireddy. "Optimization with Access to Auxiliary Information." Transactions on Machine Learning Research, 2024.
Markdown
[Chayti and Karimireddy. "Optimization with Access to Auxiliary Information." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/chayti2024tmlr-optimization/)
BibTeX
@article{chayti2024tmlr-optimization,
title = {{Optimization with Access to Auxiliary Information}},
author = {Chayti, El Mahdi and Karimireddy, Sai Praneeth},
journal = {Transactions on Machine Learning Research},
year = {2024},
url = {https://mlanthology.org/tmlr/2024/chayti2024tmlr-optimization/}
}