Optimal Multi-Distribution Learning

Abstract

Multi-distribution learning (MDL), which seeks to learn a shared model that minimizes the worst-case risk across $k$ distinct data distributions, has emerged as a unified framework in response to the evolving demand for robustness, fairness, and multi-group collaboration. Achieving data-efficient MDL requires adaptive sampling, also called on-demand sampling, throughout the learning process. However, substantial gaps persist between the state-of-the-art upper and lower bounds on the optimal sample complexity. Focusing on a hypothesis class of Vapnik–Chervonenkis (VC) dimension $d$, we propose a novel algorithm that yields an $\varepsilon$-optimal randomized hypothesis with a sample complexity on the order of $\frac{d+k}{\varepsilon^2}$ (modulo logarithmic factors), matching the best-known lower bound. Our algorithmic ideas and theory extend further to accommodate Rademacher classes. The proposed algorithms are oracle-efficient: they access the hypothesis class solely through an empirical risk minimization oracle. Additionally, we establish the necessity of randomization by unveiling a large-sample-size barrier when only deterministic hypotheses are permitted. These findings resolve three open problems posed at COLT 2023 (namely, Problems 1, 3, and 4 of Awasthi et al., 2023).
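In symbols, the worst-case-risk objective described above can be written as follows. This is a standard formalization of the MDL problem rather than a verbatim statement from the paper; $\ell$ denotes a bounded loss and $\Delta(\mathcal{H})$ the set of randomized hypotheses (distributions over the class $\mathcal{H}$), reflecting the abstract's emphasis that randomization is necessary:

```latex
% Multi-distribution learning: find a (possibly randomized) hypothesis
% minimizing the worst-case expected loss over k data distributions
% D_1, ..., D_k. (Standard formalization; notation assumed, not quoted.)
\min_{p \in \Delta(\mathcal{H})} \; \max_{1 \le i \le k} \;
  \mathbb{E}_{h \sim p} \, \mathbb{E}_{(x,y) \sim \mathcal{D}_i}
  \bigl[ \ell\bigl( h(x), y \bigr) \bigr]
```

The paper's main result is that an $\varepsilon$-optimal solution to this minimax problem can be learned from $\widetilde{O}\!\left(\frac{d+k}{\varepsilon^2}\right)$ adaptively drawn samples, matching the best-known lower bound up to logarithmic factors.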

Cite

Text

Zhang et al. "Optimal Multi-Distribution Learning." Conference on Learning Theory, 2024.

Markdown

[Zhang et al. "Optimal Multi-Distribution Learning." Conference on Learning Theory, 2024.](https://mlanthology.org/colt/2024/zhang2024colt-optimal/)

BibTeX

@inproceedings{zhang2024colt-optimal,
  title     = {{Optimal Multi-Distribution Learning}},
  author    = {Zhang, Zihan and Zhan, Wenhao and Chen, Yuxin and Du, Simon S and Lee, Jason D},
  booktitle = {Conference on Learning Theory},
  year      = {2024},
  pages     = {5220--5223},
  volume    = {247},
  url       = {https://mlanthology.org/colt/2024/zhang2024colt-optimal/}
}