SAT-Based Decision Tree Learning for Large Data Sets

Schidler, André; Szeider, Stefan

doi:10.1613/JAIR.1.15956

SAT-Based Decision Tree Learning for Large Data Sets

André Schidler, Stefan Szeider

JAIR 2024 pp. 875-918

doi:10.1613/JAIR.1.15956 /jair/2024/schidler2024jair-satbased/

Abstract

Decision trees of low depth are beneficial for understanding and interpreting the data they represent. Unfortunately, finding a decision tree of lowest complexity (depth or size) that correctly represents given data is NP-hard. Hence known algorithms either (i) utilize heuristics that do not minimize the depth or (ii) are exact but scale only to small or medium-sized instances. We propose a new hybrid approach to decision tree learning, combining heuristic and exact methods in a novel way. More specifically, we employ SAT encodings repeatedly to local parts of a decision tree provided by a standard heuristic, leading to an overall reduction in complexity. This allows us to scale the power of exact SAT-based methods to comparatively very large data sets. We evaluate our new approach experimentally on a range of real-world instances that contain up to several thousand samples. In almost all cases, our method successfully decreases the complexity of the initial decision tree; often, the decrease is significant.

PDF JAIR Semantic Scholar

Cite

Text

Schidler and Szeider. "SAT-Based Decision Tree Learning for Large Data Sets." Journal of Artificial Intelligence Research, 2024. doi:10.1613/JAIR.1.15956

Markdown

[Schidler and Szeider. "SAT-Based Decision Tree Learning for Large Data Sets." Journal of Artificial Intelligence Research, 2024.](https://mlanthology.org/jair/2024/schidler2024jair-satbased/) doi:10.1613/JAIR.1.15956

BibTeX

@article{schidler2024jair-satbased,
  title     = {{SAT-Based Decision Tree Learning for Large Data Sets}},
  author    = {Schidler, André and Szeider, Stefan},
  journal   = {Journal of Artificial Intelligence Research},
  year      = {2024},
  pages     = {875-918},
  doi       = {10.1613/JAIR.1.15956},
  volume    = {80},
  url       = {https://mlanthology.org/jair/2024/schidler2024jair-satbased/}
}