Gradient Scarcity in Graph Learning with Bilevel Optimization

Abstract

Gradient scarcity emerges when learning graphs by minimizing a loss on only a subset of nodes in the semi-supervised setting: edges between unlabeled nodes that are far from the labeled ones receive zero gradients. The phenomenon was first described when jointly optimizing the graph and the parameters of a shallow Graph Neural Network (GNN) with a single loss function. In this work, we give a precise mathematical characterization of the phenomenon and prove that it also emerges in bilevel optimization. While for GNNs gradient scarcity stems from their finite receptive field, we show that it also occurs with Laplacian regularization, whose receptive field is infinite: gradients decrease exponentially in amplitude with distance to the labeled nodes. We study several remedies, including latent graph learning with a Graph-to-Graph model (G2G), graph regularization to impose a prior structure on the graph, and reducing the graph diameter by optimizing over a larger set of edges. Our experiments validate our analysis and show that the issue also occurs with Approximate Personalized Propagation of Neural Predictions (APPNP), which approximates a model with an infinite receptive field.
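
To make the exponential-decay claim concrete, the following minimal sketch (not the authors' code; the path-graph setup, labeled/validation node choices, and hyperparameters such as lam are illustrative assumptions) differentiates a Laplacian-regularized inner solution with respect to edge weights using JAX and prints the resulting gradient magnitudes per edge:

# Minimal sketch of gradient scarcity with Laplacian regularization
# on a path graph; all concrete values below are illustrative assumptions.
import jax
import jax.numpy as jnp

n = 20                                   # nodes on a path graph
edges = [(i, i + 1) for i in range(n - 1)]
labeled = jnp.array([0, 1])              # inner (training) labels at one end
val = jnp.array([2, 3])                  # outer (validation) labels nearby
y = jnp.zeros(n).at[jnp.array([0, 2])].set(1.0)  # arbitrary binary labels
lam = 1.0                                # regularization strength (assumed)

def laplacian(w):
    # Graph Laplacian L = D - A built from edge weights w.
    A = jnp.zeros((n, n))
    for k, (i, j) in enumerate(edges):
        A = A.at[i, j].set(w[k]).at[j, i].set(w[k])
    return jnp.diag(A.sum(1)) - A

def inner_solution(w):
    # Closed-form minimizer of ||f_labeled - y_labeled||^2 + lam * f^T L f.
    S = jnp.zeros((n, n)).at[labeled, labeled].set(1.0)
    return jnp.linalg.solve(S + lam * laplacian(w) + 1e-6 * jnp.eye(n), S @ y)

def outer_loss(w):
    # Outer loss evaluated on the held-out (validation) nodes.
    f = inner_solution(w)
    return jnp.sum((f[val] - y[val]) ** 2)

g = jax.grad(outer_loss)(jnp.ones(len(edges)))
print(jnp.abs(g))   # gradient magnitude per edge, ordered along the path

On this toy graph, the printed magnitudes shrink roughly geometrically for edges farther from the labeled and validation nodes, mirroring the gradient scarcity behavior described in the abstract.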

Cite

Text

Ghanem et al. "Gradient Scarcity in Graph Learning with Bilevel Optimization." Transactions on Machine Learning Research, 2024.

Markdown

[Ghanem et al. "Gradient Scarcity in Graph Learning with Bilevel Optimization." Transactions on Machine Learning Research, 2024.](https://mlanthology.org/tmlr/2024/ghanem2024tmlr-gradient/)

BibTeX

@article{ghanem2024tmlr-gradient,
  title     = {{Gradient Scarcity in Graph Learning with Bilevel Optimization}},
  author    = {Ghanem, Hashem and Vaiter, Samuel and Keriven, Nicolas},
  journal   = {Transactions on Machine Learning Research},
  year      = {2024},
  url       = {https://mlanthology.org/tmlr/2024/ghanem2024tmlr-gradient/}
}