Scalable Optimal Multiway-Split Decision Trees with Constraints
Abstract
There has been a surge of interest in learning optimal decision trees using mixed-integer programs (MIP) in recent years, as heuristic-based methods do not guarantee optimality and find it challenging to incorporate constraints that are critical for many practical applications. However, existing MIP methods that build on an arc-based formulation do not scale well as the number of binary variables is in the order of 2 to the power of the depth of the tree and the size of the dataset. Moreover, they can only handle sample-level constraints and linear metrics. In this paper, we propose a novel path-based MIP formulation where the number of decision variables is independent of dataset size. We present a scalable column generation framework to solve the MIP. Our framework produces a multiway-split tree which is more interpretable than the typical binary-split trees due to its shorter rules. Our framework is more general as it can handle nonlinear metrics such as F1 score, and incorporate a broader class of constraints. We demonstrate its efficacy with extensive experiments. We present results on datasets containing up to 1,008,372 samples while existing MIP-based decision tree models do not scale well on data beyond a few thousand points. We report superior or competitive results compared to the state-of-art MIP-based methods with up to a 24X reduction in runtime.
Cite
Text
Subramanian and Sun. "Scalable Optimal Multiway-Split Decision Trees with Constraints." AAAI Conference on Artificial Intelligence, 2023. doi:10.1609/AAAI.V37I8.26180Markdown
[Subramanian and Sun. "Scalable Optimal Multiway-Split Decision Trees with Constraints." AAAI Conference on Artificial Intelligence, 2023.](https://mlanthology.org/aaai/2023/subramanian2023aaai-scalable/) doi:10.1609/AAAI.V37I8.26180BibTeX
@inproceedings{subramanian2023aaai-scalable,
title = {{Scalable Optimal Multiway-Split Decision Trees with Constraints}},
author = {Subramanian, Shivaram and Sun, Wei},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2023},
pages = {9891-9899},
doi = {10.1609/AAAI.V37I8.26180},
url = {https://mlanthology.org/aaai/2023/subramanian2023aaai-scalable/}
}