Multi-Task Learning Curve Forecasting Across Hyperparameter Configurations and Datasets

Abstract

The computational challenges arising from increasingly large search spaces in hyperparameter optimization necessitate the use of performance prediction methods. Previous work has shown that performance approximated at various fidelity levels can be used to terminate sub-optimal model configurations early and efficiently. In this paper, we design a sequence-to-sequence learning curve forecasting method paired with a novel objective formulation that takes into account earliness, multi-horizon, and multi-target aspects. This formulation explicitly optimizes for forecasting shorter learning curves to distant horizons and regularizes the predictions with auxiliary forecasting of multiple targets, such as gradient statistics, that are additionally collected over time. Furthermore, by embedding meta-knowledge, the model exploits latent correlations among source dataset representations and configuration trajectories, which generalizes to accurately forecasting partially observed learning curves from unseen target datasets and configurations. We experimentally validate the superiority of the method over learning curve forecasting baselines and over several ablations of the objective function formulation. Additional experiments showcase accelerated hyperparameter optimization culminating in near-optimal model performance.
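
The abstract does not spell out the objective, so the following is only an illustrative sketch of how earliness, multi-horizon, and multi-target terms could be combined in one loss. It is not the authors' formulation. The function names, the squared-error criterion, the uniform averaging, and the `aux_weight` parameter are all assumptions made for this example. A curve is a list of time steps, where each step holds the main metric followed by auxiliary targets (for instance, a gradient statistic).

```python
from typing import Callable, List, Sequence

Step = Sequence[float]  # one time step: [main metric, auxiliary targets...]
Forecaster = Callable[[Sequence[Step], int], Sequence[float]]


def multi_task_forecast_loss(
    curve: Sequence[Step],
    forecaster: Forecaster,
    prefix_lengths: Sequence[int],
    horizons: Sequence[int],
    aux_weight: float = 0.1,  # hypothetical weight on the auxiliary targets
) -> float:
    """Mean squared forecasting error over prefixes, horizons, and targets.

    Illustrative only: averages errors over several observed prefix lengths
    (earliness), several look-ahead distances (multi-horizon), and all target
    channels (multi-target), with auxiliary channels down-weighted.
    """
    total, count = 0.0, 0
    for t in prefix_lengths:
        prefix = curve[:t]  # the partially observed learning curve
        for h in horizons:
            if t + h > len(curve):
                continue  # no ground truth that far ahead
            truth = curve[t + h - 1]  # step that lies h steps past the prefix
            pred = forecaster(prefix, h)
            main_err = (pred[0] - truth[0]) ** 2
            aux_err = sum((p - y) ** 2 for p, y in zip(pred[1:], truth[1:]))
            total += main_err + aux_weight * aux_err
            count += 1
    if count == 0:
        raise ValueError("no valid (prefix, horizon) pair for this curve")
    return total / count


def last_value_forecaster(prefix: Sequence[Step], horizon: int) -> List[float]:
    """Naive baseline (an assumption for the demo): repeat the last observed step."""
    return list(prefix[-1])


if __name__ == "__main__":
    # Toy curve: [validation accuracy, gradient norm] per epoch (made-up values).
    toy_curve = [[0.5, 1.0], [0.6, 0.8], [0.7, 0.6], [0.8, 0.4]]
    loss = multi_task_forecast_loss(
        toy_curve, last_value_forecaster, prefix_lengths=[1, 2], horizons=[1, 2]
    )
    print(f"loss of the last-value baseline: {loss:.4f}")
```

In this sketch, including short prefixes in `prefix_lengths` and large values in `horizons` is what pushes a learned forecaster toward early, long-range predictions, while the auxiliary term acts as a regularizer on the main forecast.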

Cite

Text

Jawed et al. "Multi-Task Learning Curve Forecasting Across Hyperparameter Configurations and Datasets." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021. doi:10.1007/978-3-030-86486-6_30

Markdown

[Jawed et al. "Multi-Task Learning Curve Forecasting Across Hyperparameter Configurations and Datasets." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2021.](https://mlanthology.org/ecmlpkdd/2021/jawed2021ecmlpkdd-multitask/) doi:10.1007/978-3-030-86486-6_30

BibTeX

@inproceedings{jawed2021ecmlpkdd-multitask,
  title     = {{Multi-Task Learning Curve Forecasting Across Hyperparameter Configurations and Datasets}},
  author    = {Jawed, Shayan and Jomaa, Hadi S. and Schmidt-Thieme, Lars and Grabocka, Josif},
  booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
  year      = {2021},
  pages     = {485--501},
  doi       = {10.1007/978-3-030-86486-6_30},
  url       = {https://mlanthology.org/ecmlpkdd/2021/jawed2021ecmlpkdd-multitask/}
}