On the Consistency Rate of Decision Tree Learning Algorithms

Qin-Cheng Zheng, Shen-Huan Lyu, Shao-Qun Zhang, Yuan Jiang, Zhi-Hua Zhou

AISTATS 2023 pp. 7824-7848

Abstract

Decision tree learning algorithms such as CART are generally based on heuristics that greedily maximize the purity gain. Though these algorithms are practically successful, their theoretical properties, such as consistency, are far from clear. In this paper, we discover that the most serious obstacle to consistency analysis for decision tree learning algorithms is that the worst-case purity gain, i.e., the core heuristic for tree splitting, can be zero. Based on this observation, we present a new algorithm, named Grid Classification And Regression Tree (GridCART), with a provable consistency rate of $\mathcal{O}(n^{-1 / (d + 2)})$, which is the first consistency rate proved for heuristic tree learning algorithms.
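
The zero-gain obstacle mentioned in the abstract can be illustrated with a small sketch. The XOR-style dataset, the choice of Gini impurity as the purity measure, and the helper names `gini` and `purity_gain` are assumptions made for illustration; they are not taken from the paper, and this is not the GridCART algorithm.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 for an empty list)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def purity_gain(points, labels, feature, threshold):
    """Impurity decrease from the axis-aligned split x[feature] <= threshold."""
    left = [y for x, y in zip(points, labels) if x[feature] <= threshold]
    right = [y for x, y in zip(points, labels) if x[feature] > threshold]
    n = len(labels)
    weighted = (len(left) / n) * gini(left) + (len(right) / n) * gini(right)
    return gini(labels) - weighted


# Hypothetical XOR-style data: the label is 1 exactly when the coordinates differ.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [0, 1, 1, 0]

# Try the only non-trivial axis-aligned split on each feature.
gains = [purity_gain(points, labels, feature, 0.5) for feature in (0, 1)]
print(gains)
```

On this toy data, each candidate split leaves both children with the same class mix as the parent, so a greedy learner guided only by purity gain has no preferred first split even though two levels of splitting would separate the classes perfectly.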

Cite

Text

Zheng et al. "On the Consistency Rate of Decision Tree Learning Algorithms." Artificial Intelligence and Statistics, 2023.

Markdown

[Zheng et al. "On the Consistency Rate of Decision Tree Learning Algorithms." Artificial Intelligence and Statistics, 2023.](https://mlanthology.org/aistats/2023/zheng2023aistats-consistency/)

BibTeX

@inproceedings{zheng2023aistats-consistency,
  title     = {{On the Consistency Rate of Decision Tree Learning Algorithms}},
  author    = {Zheng, Qin-Cheng and Lyu, Shen-Huan and Zhang, Shao-Qun and Jiang, Yuan and Zhou, Zhi-Hua},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2023},
  pages     = {7824-7848},
  volume    = {206},
  url       = {https://mlanthology.org/aistats/2023/zheng2023aistats-consistency/}
}