Provable Guarantees for Decision Tree Induction: The Agnostic Setting

Abstract

We give strengthened provable guarantees on the performance of widely employed and empirically successful {\sl top-down decision tree learning heuristics}. While prior works have focused on the realizable setting, we consider the more realistic and challenging {\sl agnostic} setting. We show that for all monotone functions $f$ and $s\in \mathbb{N}$, these heuristics construct a decision tree of size $s^{\tilde{O}((\log s)/\varepsilon^2)}$ that achieves error $\le \mathsf{opt}_s + \varepsilon$, where $\mathsf{opt}_s$ denotes the error of the optimal size-$s$ decision tree for $f$. Previously such a guarantee was not known to be achievable by any algorithm, even one that is not based on top-down heuristics. We complement our algorithmic guarantee with a near-matching $s^{\tilde{\Omega}(\log s)}$ lower bound.
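The top-down heuristics the abstract refers to are ID3/CART-style greedy learners: repeatedly split on the coordinate that most reduces an impurity measure, until a size budget is exhausted. Below is a minimal illustrative sketch of such a heuristic for Boolean features, using empirical entropy gain as the splitting criterion; it is not the paper's exact procedure, splitting criterion, or analysis, just a sketch of the general algorithmic template.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def build_tree(X, y, size_budget):
    """Greedy top-down induction: split on the feature with the largest
    empirical entropy gain, recursing until the size budget runs out or
    the labels are pure. Leaves predict the majority label."""
    if size_budget <= 1 or len(set(y)) <= 1:
        return Counter(y).most_common(1)[0][0]  # leaf: majority label
    best = None  # (gain, feature index)
    for i in range(len(X[0])):
        left_y = [label for x, label in zip(X, y) if x[i] == 0]
        right_y = [label for x, label in zip(X, y) if x[i] == 1]
        if not left_y or not right_y:
            continue  # split is trivial; skip
        gain = entropy(y) - (
            len(left_y) * entropy(left_y) + len(right_y) * entropy(right_y)
        ) / len(y)
        if best is None or gain > best[0]:
            best = (gain, i)
    if best is None:
        return Counter(y).most_common(1)[0][0]
    i = best[1]
    half = size_budget // 2  # split the remaining budget between subtrees
    left = build_tree(
        [x for x in X if x[i] == 0],
        [label for x, label in zip(X, y) if x[i] == 0],
        half,
    )
    right = build_tree(
        [x for x in X if x[i] == 1],
        [label for x, label in zip(X, y) if x[i] == 1],
        half,
    )
    return (i, left, right)  # internal node: (feature, left subtree, right subtree)


def predict(tree, x):
    """Walk the tree until a leaf label is reached."""
    while isinstance(tree, tuple):
        i, left, right = tree
        tree = right if x[i] == 1 else left
    return tree
```

For example, on the (monotone) 3-bit majority function, this greedy learner recovers an exact tree within a small size budget; the paper's guarantee bounds how large such greedily grown trees must be to compete with the best size-$s$ tree on general monotone targets.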

Cite

Text

Blanc et al. "Provable Guarantees for Decision Tree Induction: The Agnostic Setting." International Conference on Machine Learning, 2020.

Markdown

[Blanc et al. "Provable Guarantees for Decision Tree Induction: The Agnostic Setting." International Conference on Machine Learning, 2020.](https://mlanthology.org/icml/2020/blanc2020icml-provable/)

BibTeX

@inproceedings{blanc2020icml-provable,
  title     = {{Provable Guarantees for Decision Tree Induction: The Agnostic Setting}},
  author    = {Blanc, Guy and Lange, Jane and Tan, Li-Yang},
  booktitle = {International Conference on Machine Learning},
  year      = {2020},
  pages     = {941--949},
  volume    = {119},
  url       = {https://mlanthology.org/icml/2020/blanc2020icml-provable/}
}