Cross-Domain and Cross-Dimension Learning for Image-to-Graph Transformers

Abstract

Direct image-to-graph transformation is a challenging task that solves object detection and relationship prediction in a single model. Due to this task's complexity, large training datasets are rare in many domains, making the training of deep-learning methods challenging. This data sparsity necessitates transfer learning strategies akin to the state of the art in general computer vision. In this work, we introduce a set of methods enabling cross-domain and cross-dimension learning for image-to-graph transformers. We propose (1) a regularized edge sampling loss to effectively learn object relations in multiple domains with different numbers of edges, (2) a domain adaptation framework for image-to-graph transformers that aligns image- and graph-level features from different domains, and (3) a projection function that allows using 2D data for training 3D transformers. We demonstrate our method's utility in cross-domain and cross-dimension experiments, where we utilize labeled data from 2D road networks for simultaneous learning in vastly different target domains. Our method consistently outperforms standard transfer learning and self-supervised pretraining on challenging benchmarks, such as retinal or whole-brain vessel graph extraction.

Cite

Text

Berger et al. "Cross-Domain and Cross-Dimension Learning for Image-to-Graph Transformers." Winter Conference on Applications of Computer Vision, 2025.

Markdown

[Berger et al. "Cross-Domain and Cross-Dimension Learning for Image-to-Graph Transformers." Winter Conference on Applications of Computer Vision, 2025.](https://mlanthology.org/wacv/2025/berger2025wacv-crossdomain/)

BibTeX

@inproceedings{berger2025wacv-crossdomain,
  title     = {{Cross-Domain and Cross-Dimension Learning for Image-to-Graph Transformers}},
  author    = {Berger, Alexander H. and Lux, Laurin and Shit, Suprosanna and Ezhov, Ivan and Kaissis, Georgios and Menten, Martin J. and Rueckert, Daniel and Paetzold, Johannes C.},
  booktitle = {Winter Conference on Applications of Computer Vision},
  year      = {2025},
  pages     = {64--74},
  url       = {https://mlanthology.org/wacv/2025/berger2025wacv-crossdomain/}
}