Practical Multi-Fidelity Bayesian Optimization for Hyperparameter Tuning

Abstract

Bayesian optimization is popular for optimizing time-consuming black-box objectives. Nonetheless, for hyperparameter tuning in deep neural networks, the time required to evaluate the validation error for even a few hyperparameter settings remains a bottleneck. Multi-fidelity optimization promises relief using cheaper proxies to such objectives — for example, validation error for a network trained using a subset of the training points or fewer iterations than required for convergence. We propose a highly flexible and practical approach to multi-fidelity Bayesian optimization, focused on efficiently optimizing hyperparameters for iteratively trained supervised learning models. We introduce a new acquisition function, the trace-aware knowledge-gradient, which efficiently leverages both multiple continuous fidelity controls and trace observations — values of the objective at a sequence of fidelities, available when varying fidelity using training iterations. We provide a provably convergent method for optimizing our acquisition function and show it outperforms state-of-the-art alternatives for hyperparameter tuning of deep neural networks and large-scale kernel learning.
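The core economics the abstract describes can be illustrated with a toy sketch. The code below is not the paper's trace-aware knowledge-gradient method; it is a cheap two-stage heuristic in the same spirit, showing (a) trace observations, where training to fidelity `s` also reveals the objective at every earlier iteration count, and (b) spending the full training budget only on hyperparameters that look promising at low fidelity. The objective `validation_error` is a hypothetical stand-in for a real training run.

```python
import random

def validation_error(x, s):
    # Toy stand-in for a network's validation error after s training
    # iterations with hyperparameter x (hypothetical, for illustration):
    # error shrinks as training fidelity s grows.
    return (x - 0.3) ** 2 + 1.0 / s

def train_with_trace(x, s):
    # Training to fidelity s also reveals the error at every earlier
    # iteration count -- the "trace observations" the abstract mentions.
    return [validation_error(x, t) for t in range(1, s + 1)]

def multifidelity_tune(n_candidates=32, low=4, high=64, promote=4, seed=0):
    """Cheap two-stage heuristic (NOT the paper's acquisition function):
    screen all candidates at low fidelity, then promote only the best
    few to full fidelity."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n_candidates)]
    # Stage 1: low-fidelity screening; the final trace entry is the
    # error at the probe fidelity.
    probes = {x: train_with_trace(x, low)[-1] for x in xs}
    # Stage 2: spend the full budget on the `promote` best candidates.
    finalists = sorted(xs, key=probes.get)[:promote]
    results = {x: train_with_trace(x, high)[-1] for x in finalists}
    best_x = min(results, key=results.get)
    return best_x, results[best_x]

best_x, best_err = multifidelity_tune()
```

With these settings the search costs 32 × 4 + 4 × 64 = 384 training iterations, versus 32 × 64 = 2048 for evaluating every candidate at full fidelity, which is the kind of saving multi-fidelity methods aim for; the paper's contribution is a principled acquisition function that makes such fidelity/cost trade-offs automatically.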

Cite

Text

Wu et al. "Practical Multi-Fidelity Bayesian Optimization for Hyperparameter Tuning." Uncertainty in Artificial Intelligence, 2019.

Markdown

[Wu et al. "Practical Multi-Fidelity Bayesian Optimization for Hyperparameter Tuning." Uncertainty in Artificial Intelligence, 2019.](https://mlanthology.org/uai/2019/wu2019uai-practical/)

BibTeX

@inproceedings{wu2019uai-practical,
  title     = {{Practical Multi-Fidelity Bayesian Optimization for Hyperparameter Tuning}},
  author    = {Wu, Jian and Toscano-Palmerin, Saul and Frazier, Peter I. and Wilson, Andrew Gordon},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2019},
  pages     = {788--798},
  volume    = {115},
  url       = {https://mlanthology.org/uai/2019/wu2019uai-practical/}
}