Learning Augmented Graph $k$-Clustering
Abstract
Clustering is a fundamental task in unsupervised learning. Previous research has focused on learning-augmented $k$-means in Euclidean metrics, limiting its applicability to complex data representations. In this paper, we generalize learning-augmented $k$-clustering to operate on general metrics, enabling its application to graph-structured and non-Euclidean domains. Our framework also relaxes restrictive cluster size constraints, providing greater flexibility for datasets with imbalanced or unknown cluster distributions. Furthermore, we extend the hardness of query complexity to general metrics: under the Exponential Time Hypothesis (ETH), we show that any polynomial-time algorithm must perform approximately $\Omega(k / \alpha)$ queries to achieve a $(1 + \alpha)$-approximation. These contributions strengthen both the theoretical foundations and practical applicability of learning-augmented clustering, bridging gaps between traditional methods and real-world challenges.
Cite
Text
Fan and Shin. "Learning Augmented Graph $k$-Clustering." Proceedings of Thirty Eighth Conference on Learning Theory, 2025.
Markdown
[Fan and Shin. "Learning Augmented Graph $k$-Clustering." Proceedings of Thirty Eighth Conference on Learning Theory, 2025.](https://mlanthology.org/colt/2025/fan2025colt-learning/)
BibTeX
@inproceedings{fan2025colt-learning,
title = {{Learning Augmented Graph $k$-Clustering}},
author = {Fan, Chenglin and Shin, Kijun},
booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
year = {2025},
pages = {1919-1934},
volume = {291},
url = {https://mlanthology.org/colt/2025/fan2025colt-learning/}
}