The Optimal PAC Algorithm
Abstract
Assume we are trying to learn a concept class C of VC dimension d with respect to an arbitrary distribution. There is a PAC sample size bound that holds for any algorithm that always predicts with some consistent concept in the class C (BEHW89): $O(\frac{1}{\epsilon}(d\log\frac{1}{\epsilon}+\log\frac{1}{\delta}))$, where ε and δ are the accuracy and confidence parameters. Thus after drawing this many examples (labeled consistently with some concept in C), with probability at least 1 − δ, the error of the produced concept is at most ε. Here the examples are drawn with respect to an arbitrary but fixed distribution D, and the accuracy is measured with respect to the same distribution.
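The bound above can be illustrated empirically. The sketch below is an assumption-laden toy, not the paper's construction: it learns a threshold concept on [0, 1] (VC dimension d = 1) with a consistent learner, draws the number of examples suggested by the $O(\frac{1}{\epsilon}(d\log\frac{1}{\epsilon}+\log\frac{1}{\delta}))$ bound (with an assumed constant), and checks that the fraction of trials with error above ε stays below δ. All names (`pac_sample_size`, `consistent_threshold`, `trial`) are illustrative.

```python
import math
import random

def pac_sample_size(eps, delta, d=1, c=4.0):
    """Sample size of order (1/eps) * (d*log(1/eps) + log(1/delta)).
    The leading constant c is an assumption for this demo."""
    return int(math.ceil((c / eps) * (d * math.log(1 / eps) + math.log(1 / delta))))

def consistent_threshold(sample):
    """Return a hypothesis threshold consistent with the labeled sample.
    Target concept: x >= t is labeled positive; the smallest positive
    example is always a consistent choice."""
    positives = [x for x, y in sample if y]
    return min(positives) if positives else 1.0

def trial(eps, delta, target=0.3, rng=random):
    """One PAC trial: draw m examples, fit a consistent hypothesis,
    return its true error under the uniform distribution on [0, 1]."""
    m = pac_sample_size(eps, delta)
    sample = [(x, x >= target) for x in (rng.random() for _ in range(m))]
    h = consistent_threshold(sample)
    # Only the interval between target and h is misclassified, so the
    # true error under the uniform distribution is |h - target|.
    return abs(h - target)

random.seed(0)
eps, delta = 0.05, 0.05
errors = [trial(eps, delta) for _ in range(200)]
frac_bad = sum(e > eps for e in errors) / len(errors)
print(f"fraction of trials with error > eps: {frac_bad:.3f}")  # should be <= delta
```

With ε = δ = 0.05 this draws a few hundred examples per trial; the consistent threshold then lands so close to the target that essentially no trial exceeds ε, consistent with the 1 − δ guarantee.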
Cite

[Warmuth. "The Optimal PAC Algorithm." Annual Conference on Computational Learning Theory, 2004.](https://mlanthology.org/colt/2004/warmuth2004colt-optimal/) doi:10.1007/978-3-540-27819-1_45

BibTeX:
@inproceedings{warmuth2004colt-optimal,
title = {{The Optimal PAC Algorithm}},
author = {Warmuth, Manfred K.},
booktitle = {Annual Conference on Computational Learning Theory},
year = {2004},
pages = {641--642},
doi = {10.1007/978-3-540-27819-1_45},
url = {https://mlanthology.org/colt/2004/warmuth2004colt-optimal/}
}