Optimal Decision Trees for Interpretable and Constrained Clustering

Abstract

Constrained clustering is a semi-supervised approach to determining meaningful groupings of data that respect userspecified constraints. Such constraints are typically used to enforce desirable structural and domain-specific properties of the resulting clusters. Notably, such constraints can significantly improve the quality and accuracy of clustering. Data clustering solutions can take on many different forms. Decision trees are a particularly desirable solution form because of their inherent interpretability. Unfortunately, existing decision tree clustering approaches do not support clustering constraints and do not provide strong theoretical guarantees with respect to solution quality. To address the task of decision tree clustering with constraints, we present a novel SAT-based encoding that solves the problem to an approximated optimality in relation to a well-known bi-criteria objective. Our framework is the first exact approach for interpretable constrained clustering with decision trees. Experiments involving a range of real-world and synthetic datasets demonstrate that our approach can produce interpretable clustering solutions that are of superior quality compared to their non-interpretable counterparts, with or without the addition of constraints. We further provide new insights into the trade-off between interpretability and the satisfaction of user-specified constraints, presenting extensions to our clustering approach that treat the satisfaction of constraints as an additional optimization objective.

Cite

Text

Shati et al. "Optimal Decision Trees for Interpretable and Constrained Clustering." Journal of Artificial Intelligence Research, 2025. doi:10.1613/JAIR.1.18144

Markdown

[Shati et al. "Optimal Decision Trees for Interpretable and Constrained Clustering." Journal of Artificial Intelligence Research, 2025.](https://mlanthology.org/jair/2025/shati2025jair-optimal/) doi:10.1613/JAIR.1.18144

BibTeX

@article{shati2025jair-optimal,
  title     = {{Optimal Decision Trees for Interpretable and Constrained Clustering}},
  author    = {Shati, Pouya and Song, Yuliang and Cohen, Eldan and McIlraith, Sheila A.},
  journal   = {Journal of Artificial Intelligence Research},
  year      = {2025},
  doi       = {10.1613/JAIR.1.18144},
  volume    = {84},
  url       = {https://mlanthology.org/jair/2025/shati2025jair-optimal/}
}