Formalizing Hierarchical Clustering as Integer Linear Programming
Abstract
Hierarchical clustering is typically implemented as a greedy heuristic algorithm with no explicit objective function. In this work we formalize hierarchical clustering as an integer linear programming (ILP) problem with a natural objective function and the dendrogram properties enforced as linear constraints. Though exact solvers exists for ILP we show that a simple randomized algorithm and a linear programming (LP) relaxation can be used to provide approximate solutions faster. Formalizing hierarchical clustering also has the benefit that relaxing the constraints can produce novel problem variations such as overlapping clusterings. Our experiments show that our formulation is capable of outperforming standard agglomerative clustering algorithms in a variety of settings, including traditional hierarchical clustering as well as learning overlapping clusterings.
Cite
Text
Gilpin et al. "Formalizing Hierarchical Clustering as Integer Linear Programming." AAAI Conference on Artificial Intelligence, 2013. doi:10.1609/AAAI.V27I1.8671Markdown
[Gilpin et al. "Formalizing Hierarchical Clustering as Integer Linear Programming." AAAI Conference on Artificial Intelligence, 2013.](https://mlanthology.org/aaai/2013/gilpin2013aaai-formalizing/) doi:10.1609/AAAI.V27I1.8671BibTeX
@inproceedings{gilpin2013aaai-formalizing,
title = {{Formalizing Hierarchical Clustering as Integer Linear Programming}},
author = {Gilpin, Sean and Nijssen, Siegfried and Davidson, Ian N.},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {2013},
pages = {372-378},
doi = {10.1609/AAAI.V27I1.8671},
url = {https://mlanthology.org/aaai/2013/gilpin2013aaai-formalizing/}
}