Toward Data-Centric Directed Graph Learning: An Entropy-Driven Approach

Abstract

Although directed graphs (digraphs) offer strong modeling capabilities for complex topological systems, existing DiGraph Neural Networks (DiGNNs) struggle to fully capture the rich structural information concealed in them. This data-level limitation leads to sub-optimal predictive performance at the model level and underscores the need to further explore the potential correlations between directed edges (topology) and node profiles (features and labels) from a data-centric perspective, thereby giving model-centric neural networks stronger encoding capabilities. In this paper, we propose Entropy-driven Digraph knowlEdge distillatioN (EDEN), which can serve either as a data-centric digraph learning paradigm or as a model-agnostic, plug-and-play data-centric Knowledge Distillation (KD) module. EDEN implements data-centric machine learning by constructing a coarse-grained Hierarchical Knowledge Tree (HKT) using the proposed hierarchical encoding theory, and then refining the HKT through mutual information analysis of node profiles to guide knowledge distillation during training. As a general framework, EDEN naturally extends to undirected graphs and consistently delivers strong performance. Extensive experiments on 14 (di)graph datasets, spanning both homophily and heterophily settings, and across four downstream tasks show that EDEN achieves state-of-the-art (SOTA) results and significantly enhances existing (Di)GNNs.

Cite

Text

Li et al. "Toward Data-Centric Directed Graph Learning: An Entropy-Driven Approach." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Li et al. "Toward Data-Centric Directed Graph Learning: An Entropy-Driven Approach." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/li2025icml-datacentric/)

BibTeX

@inproceedings{li2025icml-datacentric,
  title     = {{Toward Data-Centric Directed Graph Learning: An Entropy-Driven Approach}},
  author    = {Li, Xunkai and Wu, Zhengyu and Yu, Kaichi and Qin, Hongchao and Zeng, Guang and Li, Rong-Hua and Wang, Guoren},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {36310--36339},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/li2025icml-datacentric/}
}