Open Problem: The Sample Complexity of Multi-Distribution Learning for VC Classes

Abstract

Multi-distribution learning is a natural generalization of PAC learning to settings with multiple data distributions. Even for PAC-learnable classes, a significant gap remains between the known upper and lower bounds on its sample complexity. In particular, while the best known upper bound on the sample complexity of learning a VC dimension $d$ class on $k$ distributions is $O(\epsilon^{-2} \ln(k) (d + k) + \min \{\epsilon^{-1} d k, \epsilon^{-4} \ln(k) d\})$, the best known lower bound is only $\Omega(\epsilon^{-2}(d + k \ln(k)))$. We discuss recent progress on this problem and some hurdles that are fundamental to the use of game dynamics in statistical learning.
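To make the gap concrete, the following minimal Python sketch evaluates both bounds with all constants and lower-order terms dropped. The function names and the parameter choices ($d = k = 1000$, $\epsilon = 0.1$) are illustrative assumptions, not from the paper.

```python
import math

def upper_bound(eps: float, d: int, k: int) -> float:
    # Best known upper bound, constants dropped:
    # eps^-2 * ln(k) * (d + k) + min{eps^-1 * d * k, eps^-4 * ln(k) * d}
    return (math.log(k) * (d + k)) / eps**2 + min(
        d * k / eps, d * math.log(k) / eps**4
    )

def lower_bound(eps: float, d: int, k: int) -> float:
    # Best known lower bound, constants dropped:
    # eps^-2 * (d + k * ln(k))
    return (d + k * math.log(k)) / eps**2

eps, d, k = 0.1, 1000, 1000  # illustrative values, not from the paper
print(f"upper bound ~ {upper_bound(eps, d, k):.2e}")  # ~1.1e7
print(f"lower bound ~ {lower_bound(eps, d, k):.2e}")  # ~7.9e5
```

In this regime the additive $\min \{\epsilon^{-1} d k, \epsilon^{-4} \ln(k) d\}$ term dominates the upper bound, illustrating the gap that the open problem asks to close.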

Cite

Text

Awasthi et al. "Open Problem: The Sample Complexity of Multi-Distribution Learning for VC Classes." Conference on Learning Theory, 2023.

Markdown

[Awasthi et al. "Open Problem: The Sample Complexity of Multi-Distribution Learning for VC Classes." Conference on Learning Theory, 2023.](https://mlanthology.org/colt/2023/awasthi2023colt-open/)

BibTeX

@inproceedings{awasthi2023colt-open,
  title     = {{Open Problem: The Sample Complexity of Multi-Distribution Learning for VC Classes}},
  author    = {Awasthi, Pranjal and Haghtalab, Nika and Zhao, Eric},
  booktitle = {Conference on Learning Theory},
  year      = {2023},
  pages     = {5943--5949},
  volume    = {195},
  url       = {https://mlanthology.org/colt/2023/awasthi2023colt-open/}
}