Architecture-Aware Learning Curve Extrapolation via Graph Ordinary Differential Equation

Abstract

Learning curve extrapolation predicts neural network performance from early training epochs and has been applied to accelerate AutoML, facilitating hyperparameter tuning and neural architecture search. However, existing methods typically model the evolution of learning curves in isolation, neglecting the impact of neural network (NN) architectures, which influence the loss landscape and learning trajectories. In this work, we explore whether incorporating neural network architecture improves learning curve modeling and how to effectively integrate this architectural information. Motivated by the dynamical system view of optimization, we propose a novel architecture-aware neural differential equation model to forecast learning curves continuously. We empirically demonstrate its ability to capture the general trend of fluctuating learning curves while quantifying uncertainty through variational parameters. Our model outperforms current state-of-the-art learning curve extrapolation methods and pure time-series modeling approaches for both MLP-based and CNN-based learning curves. Additionally, we explore the applicability of our method in neural architecture search scenarios, such as training configuration ranking.
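To make the idea of an architecture-conditioned differential equation concrete, here is a minimal toy sketch. It is not the paper's model. It assumes a hypothetical saturating dynamic, dy/dt = rate * (ceiling - y), where both `rate` and `ceiling` are derived from a pooled graph embedding of the architecture. The function names, node features, and weights are all invented for illustration; the weights are hand-picked rather than learned, and the variational uncertainty mentioned in the abstract is omitted.

```python
import math


def graph_embedding(adjacency, node_features):
    """One round of mean aggregation over neighbours, then mean pooling over nodes."""
    n = len(node_features)
    dim = len(node_features[0])
    updated = []
    for i in range(n):
        # Each node averages its own features with those of its neighbours.
        group = [i] + [j for j in range(n) if adjacency[i][j]]
        updated.append(
            [sum(node_features[j][d] for j in group) / len(group) for d in range(dim)]
        )
    # Pool the node vectors into a single architecture vector.
    return [sum(row[d] for row in updated) / n for d in range(dim)]


def dynamics_params(embedding, weights):
    """Map the embedding to a positive rate and a ceiling in (0, 1) (assumed form)."""
    rate = math.exp(sum(w * e for w, e in zip(weights["rate"], embedding)))
    logit = sum(w * e for w, e in zip(weights["ceiling"], embedding))
    ceiling = 1.0 / (1.0 + math.exp(-logit))
    return rate, ceiling


def extrapolate(y0, embedding, weights, t_end, dt=0.1):
    """Forward-Euler integration of dy/dt = rate * (ceiling - y) from t = 0 to t_end."""
    rate, ceiling = dynamics_params(embedding, weights)
    y = y0
    curve = [y]
    for _ in range(int(round(t_end / dt))):
        y += dt * rate * (ceiling - y)
        curve.append(y)
    return curve


# Toy 3-layer chain "architecture"; the node features are made up
# (normalized width, has-activation flag) purely for illustration.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
node_features = [[0.5, 1.0], [1.0, 0.0], [0.5, 1.0]]
weights = {"rate": [-1.0, -1.0], "ceiling": [2.0, 1.0]}  # hand-picked, not learned

embedding = graph_embedding(adjacency, node_features)
curve = extrapolate(y0=0.1, embedding=embedding, weights=weights, t_end=5.0)
```

In this sketch, two architectures with different embeddings produce different predicted trajectories from the same starting accuracy `y0`, which is the property a configuration-ranking step could rely on. A learned model would replace the fixed derivative with a trained network and a more careful ODE solver.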

Cite

Text

Ding et al. "Architecture-Aware Learning Curve Extrapolation via Graph Ordinary Differential Equation." AAAI Conference on Artificial Intelligence, 2025. doi:10.1609/AAAI.V39I15.33789

Markdown

[Ding et al. "Architecture-Aware Learning Curve Extrapolation via Graph Ordinary Differential Equation." AAAI Conference on Artificial Intelligence, 2025.](https://mlanthology.org/aaai/2025/ding2025aaai-architecture/) doi:10.1609/AAAI.V39I15.33789

BibTeX

@inproceedings{ding2025aaai-architecture,
  title     = {{Architecture-Aware Learning Curve Extrapolation via Graph Ordinary Differential Equation}},
  author    = {Ding, Yanna and Huang, Zijie and Shou, Xiao and Guo, Yihang and Sun, Yizhou and Gao, Jianxi},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {16289--16297},
  doi       = {10.1609/AAAI.V39I15.33789},
  url       = {https://mlanthology.org/aaai/2025/ding2025aaai-architecture/}
}