Towards Understanding Parametric Generalized Category Discovery on Graphs
Abstract
Generalized Category Discovery (GCD) aims to identify both known and novel categories in unlabeled data by leveraging knowledge from old classes. However, existing methods are limited to non-graph data and lack the theoretical foundations to answer when and how known classes can help GCD. We introduce the Graph GCD task and provide the first rigorous theoretical analysis of parametric GCD. By quantifying the relationship between old and new classes in the embedding space using the Wasserstein distance W, we derive the first provable GCD loss bound based on W. This analysis highlights two necessary conditions for effective GCD. However, we uncover, through a Pairwise Markov Random Field perspective, that popular graph contrastive learning (GCL) methods inherently violate these conditions. To address this limitation, we propose SWIRL, a novel GCL method for GCD. Experimental results validate our theoretical findings and demonstrate SWIRL’s effectiveness.
Cite
Text
Deng et al. "Towards Understanding Parametric Generalized Category Discovery on Graphs." Proceedings of the 42nd International Conference on Machine Learning, 2025.
Markdown
[Deng et al. "Towards Understanding Parametric Generalized Category Discovery on Graphs." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/deng2025icml-understanding/)
BibTeX
@inproceedings{deng2025icml-understanding,
title = {{Towards Understanding Parametric Generalized Category Discovery on Graphs}},
author = {Deng, Bowen and Fu, Lele and Chen, Jialong and Huang, Sheng and Liao, Tianchi and Tao, Zhang and Chen, Chuan},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {13069-13109},
volume = {267},
url = {https://mlanthology.org/icml/2025/deng2025icml-understanding/}
}