Scalable Meta-Learning via Mixed-Mode Differentiation
Abstract
Gradient-based bilevel optimisation is a powerful technique with applications in hyperparameter optimisation, task adaptation, algorithm discovery, meta-learning more broadly, and beyond. It often requires differentiating through the gradient-based optimisation process itself, leading to "gradient-of-a-gradient" calculations with computationally expensive second-order and mixed derivatives. While modern automatic differentiation libraries provide a convenient way to write programs for calculating these derivatives, they oftentimes cannot fully exploit the specific structure of these problems out-of-the-box, leading to suboptimal performance. In this paper, we analyse such cases and propose Mixed-Flow Meta-Gradients, or MixFlow-MG – a practical algorithm that uses mixed-mode differentiation to construct more efficient and scalable computational graphs, yielding over 10x memory reduction and up to 25% wall-clock time improvement over standard implementations in modern meta-learning setups.
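To make the "gradient-of-a-gradient" structure concrete, the sketch below illustrates the general mixed-mode idea in JAX. It is not the authors' released implementation, and the function names are illustrative: a second-order quantity such as a Hessian-vector product can be computed forward-over-reverse (jax.jvp applied to jax.grad) instead of via the reverse-over-reverse graph that a naive grad-of-grad produces.

import jax
import jax.numpy as jnp

def inner_loss(params, x):
    # Toy inner objective, standing in for a gradient-based inner step.
    return jnp.sum(jnp.tanh(params * x) ** 2)

def hvp(params, x, v):
    # Mixed-mode (forward-over-reverse) Hessian-vector product:
    # differentiate the reverse-mode gradient once more, in forward
    # mode, along direction v. This avoids a second reverse pass.
    grad_fn = lambda p: jax.grad(inner_loss)(p, x)
    _, tangent = jax.jvp(grad_fn, (params,), (v,))
    return tangent

params = jnp.arange(3.0)
x = jnp.ones(3)
v = jnp.array([1.0, 0.0, 0.0])
print(hvp(params, x, v))  # First column of the toy Hessian.

The forward pass over the gradient function does not need to store the intermediates that a second reverse pass would, which is broadly the kind of structural saving behind the memory figures quoted in the abstract.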
Cite
Text
Kemaev et al. "Scalable Meta-Learning via Mixed-Mode Differentiation." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Kemaev et al. "Scalable Meta-Learning via Mixed-Mode Differentiation." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/kemaev2025icml-scalable/)
BibTeX
@inproceedings{kemaev2025icml-scalable,
title = {{Scalable Meta-Learning via Mixed-Mode Differentiation}},
author = {Kemaev, Iurii and Calian, Dan A. and Zintgraf, Luisa M. and Farquhar, Gregory and van Hasselt, Hado},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {29687--29705},
volume = {267},
url = {https://mlanthology.org/icml/2025/kemaev2025icml-scalable/}
}