An Information-Theoretic Perspective of Hierarchical Clustering on Graphs
Abstract
The seminal work of \citep{dasgupta2016cost} has introduced a combinatorial cost function for hierarchical graph clustering that has inspired numerous follow-up studies adopting similar combinatorial approaches. In this paper, we investigate this problem from the \emph{information-theoretic} perspective. We formulate a new cost function that is fully explainable and establish the relationship between combinatorial and information-theoretic perspectives. We present two algorithms for expander-like and well-clustered cardinality weighted graphs, respectively, and show that both of them achieve $O(1)$-approximation for our new cost function. Addressing practical needs, we consider non-binary hierarchical clustering problem, and propose a hyperparameter-free framework HCSE that recursively stratifies cluster trees through sparsity-aware partitioning, automatically determining the optimal hierarchy depth via an interpretable mechanism. Extensive experimental results demonstrate the superiority of our cost function and algorithms in binary clustering performance, hierarchy level identification, and reconstruction accuracy compared to existing approaches.
Cite
Text
Pan et al. "An Information-Theoretic Perspective of Hierarchical Clustering on Graphs." Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, 2025.Markdown
[Pan et al. "An Information-Theoretic Perspective of Hierarchical Clustering on Graphs." Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, 2025.](https://mlanthology.org/uai/2025/pan2025uai-informationtheoretic/)BibTeX
@inproceedings{pan2025uai-informationtheoretic,
title = {{An Information-Theoretic Perspective of Hierarchical Clustering on Graphs}},
author = {Pan, Yicheng and Fan, Bingchen and Long, Pengyu and Zheng, Feng},
booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
year = {2025},
pages = {3322-3345},
volume = {286},
url = {https://mlanthology.org/uai/2025/pan2025uai-informationtheoretic/}
}