Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

Abstract

We consider a covariate shift problem where one has access to several different training datasets for the same learning problem and a small validation set which possibly differs from all the individual training distributions. The distribution shift is due, in part, to unobserved features in the datasets. The objective, then, is to find the best mixture distribution over the training datasets (with only observed features) such that training a learning algorithm using this mixture has the best validation performance. Our proposed algorithm, Mix&Match, combines stochastic gradient descent (SGD) with optimistic tree search and model re-use (evolving partially trained models with samples from different mixture distributions) over the space of mixtures, for this task. We prove a novel high probability bound on the final SGD iterate without relying on a global gradient norm bound, and use it to show the advantages of model re-use. Additionally, we provide simple regret guarantees for our algorithm with respect to recovering the optimal mixture, given a total budget of SGD evaluations. Finally, we validate our algorithm on two real-world datasets.
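To make the setup concrete, the following is a minimal sketch of the two ingredients the abstract names: SGD driven by samples from a mixture over training datasets, and model re-use when moving from one mixture to the next. It is not the Mix&Match algorithm itself. The optimistic tree search is replaced here by a plain sweep over a few mixture weights, and the toy 1-D regression data, the function names `sgd_on_mixture` and `validation_loss`, the step size, and the step counts are all assumptions chosen for illustration.

```python
import random


def sgd_on_mixture(datasets, alpha, w, steps, lr=0.005, rng=None):
    """Run `steps` SGD updates for 1-D least squares (prediction = w * x).

    Each step first draws a source dataset according to the mixture
    weights `alpha`, then draws one (x, y) example from that dataset.
    Starting from a non-zero `w` corresponds to re-using a partially
    trained model.
    """
    rng = rng or random.Random(0)
    for _ in range(steps):
        k = rng.choices(range(len(datasets)), weights=alpha)[0]
        x, y = rng.choice(datasets[k])
        w -= lr * 2.0 * (w * x - y) * x  # gradient of (w * x - y) ** 2
    return w


def validation_loss(w, val):
    """Mean squared error of the model `w` on the validation set."""
    return sum((w * x - y) ** 2 for x, y in val) / len(val)


# Hypothetical toy data: two training sets with different slopes, and a
# validation set whose slope matches neither training set on its own.
xs = [0.5, 1.0, 1.5]
train = [[(x, 1.0 * x) for x in xs], [(x, 3.0 * x) for x in xs]]
val = [(x, 2.5 * x) for x in xs]

# Stand-in for the tree search: sweep the weight on the second dataset,
# warm-starting each run from the model trained on the previous mixture.
rng = random.Random(0)
w = 0.0
results = {}
for a1 in [0.0, 0.25, 0.5, 0.75, 1.0]:
    w = sgd_on_mixture(train, (1.0 - a1, a1), w, steps=4000, rng=rng)
    results[a1] = (w, validation_loss(w, val))

best_alpha = min(results, key=lambda a: results[a][1])
print("best weight on dataset 2:", best_alpha)
```

In this toy problem neither training set alone matches the validation slope, so a mixture that puts weight on both can reach a lower validation loss than either pure dataset. The sweep spends the same SGD budget on every candidate mixture, whereas the abstract describes allocating a total budget adaptively over the space of mixtures.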

Cite

Text

Faw et al. "Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions." Neural Information Processing Systems, 2020.

Markdown

[Faw et al. "Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions." Neural Information Processing Systems, 2020.](https://mlanthology.org/neurips/2020/faw2020neurips-mix/)

BibTeX

@inproceedings{faw2020neurips-mix,
  title     = {{Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions}},
  author    = {Faw, Matthew and Sen, Rajat and Shanmugam, Karthikeyan and Caramanis, Constantine and Shakkottai, Sanjay},
  booktitle = {Neural Information Processing Systems},
  year      = {2020},
  url       = {https://mlanthology.org/neurips/2020/faw2020neurips-mix/}
}