Improving Hyperparameter Optimization with Checkpointed Model Weights
Abstract
As the scale of foundation models continues to grow, efficient hyperparameter optimization (HPO) becomes increasingly critical to manage the substantial computational resources required for training and downstream usage. Traditional HPO methods are often prohibitively expensive in these scenarios, motivating the need for more sophisticated approaches. Classical methods treat this as a black-box optimization problem. However, gray-box HPO methods, which incorporate more information about the setup, have emerged as a promising direction for more efficient optimization. In this work, we propose an HPO method for neural networks using logged checkpoints of trained weights to guide future hyperparameter selections. Our method, Forecasting Model Search (FMS), embeds weights into a Gaussian process deep kernel surrogate model, using a permutation-invariant graph metanetwork to be data-efficient with logged network weights. We open-source our code (https://github.com/NVlabs/forecasting-model-search).
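The core idea in the abstract can be illustrated with a minimal sketch (not the authors' implementation): embed a weight checkpoint with a permutation-invariant feature extractor (a stand-in for the graph metanetwork), then fit a Gaussian process with a deep kernel on those embeddings to predict validation loss for candidate runs. All function names and the toy data below are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def embed(weights):
    # Stand-in for the permutation-invariant graph metanetwork:
    # simple permutation-invariant summary statistics of a checkpoint.
    w = np.asarray(weights, dtype=float)
    return np.array([w.mean(), w.std(), np.abs(w).mean()])

def rbf_kernel(A, B, lengthscale=1.0):
    # Squared-exponential ("deep") kernel applied to embedded checkpoints.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / lengthscale**2)

# Toy "logged checkpoints": weight vectors with observed validation losses.
checkpoints = [rng.normal(scale=s, size=100) for s in (0.1, 0.5, 1.0, 2.0)]
losses = np.array([0.9, 0.4, 0.5, 1.2])

X = np.stack([embed(w) for w in checkpoints])
K = rbf_kernel(X, X) + 1e-6 * np.eye(len(X))  # jitter for numerical stability

# GP posterior mean at a new checkpoint; in FMS-style HPO this surrogate
# prediction would guide which hyperparameters to try next.
x_new = embed(rng.normal(scale=0.6, size=100))[None, :]
k_star = rbf_kernel(x_new, X)
mean = k_star @ np.linalg.solve(K, losses)
print(float(mean[0]))
```

In the paper's setting, the hand-crafted `embed` above is replaced by a learned graph metanetwork trained jointly with the GP, which is what makes the surrogate data-efficient with respect to logged weights.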
Cite
Text
Mehta et al. "Improving Hyperparameter Optimization with Checkpointed Model Weights." European Conference on Computer Vision Workshops, 2024. doi:10.1007/978-3-031-91979-4_8
Markdown
[Mehta et al. "Improving Hyperparameter Optimization with Checkpointed Model Weights." European Conference on Computer Vision Workshops, 2024.](https://mlanthology.org/eccvw/2024/mehta2024eccvw-improving/) doi:10.1007/978-3-031-91979-4_8
BibTeX
@inproceedings{mehta2024eccvw-improving,
title = {{Improving Hyperparameter Optimization with Checkpointed Model Weights}},
author = {Mehta, Nikhil and Lorraine, Jonathan and Masson, Steve and Arunachalam, Ramanathan and Bhat, Zaid Pervaiz and Lucas, James and Zachariah, Arun George},
booktitle = {European Conference on Computer Vision Workshops},
year = {2024},
pages = {75--96},
doi = {10.1007/978-3-031-91979-4_8},
url = {https://mlanthology.org/eccvw/2024/mehta2024eccvw-improving/}
}