Collaborative Non-Parametric Two-Sample Testing

Abstract

Multiple two-sample testing in a graph-structured setting is a common scenario in fields such as Spatial Statistics and Neuroscience. Each node $v$ of a fixed graph faces a two-sample testing problem between two node-specific probability density functions, $p_v$ and $q_v$. The goal is to identify the nodes where the null hypothesis $p_v = q_v$ should be rejected, under the assumption that connected nodes yield similar test outcomes. We propose the non-parametric collaborative two-sample testing (CTST) framework, which efficiently leverages the graph structure and minimizes the assumptions on $p_v$ and $q_v$. CTST integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning. We use synthetic experiments and a real sensor network detecting seismic activity to demonstrate that CTST outperforms state-of-the-art non-parametric statistical tests that are applied at each node independently and hence disregard the geometry of the problem.
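
To make the setting concrete, the following is a minimal sketch of the kind of node-wise baseline the abstract contrasts against: an independent non-parametric two-sample test run at each node, ignoring the graph. It is not the CTST method. The choice of a permutation test on a Gaussian-kernel MMD statistic, the function names, the bandwidth, and the significance level are all assumptions made for illustration, and no multiple-testing correction is applied.

```python
import numpy as np


def gaussian_mmd2(x, y, bandwidth=1.0):
    """Biased estimate of the squared MMD between samples x and y.

    x, y: arrays of shape (n, d) and (m, d). Gaussian kernel with an
    assumed fixed bandwidth (a hypothetical default, not from the paper).
    """
    z = np.concatenate([x, y])
    sq_dists = np.sum((z[:, None, :] - z[None, :, :]) ** 2, axis=-1)
    k = np.exp(-sq_dists / (2.0 * bandwidth ** 2))
    n = len(x)
    return k[:n, :n].mean() + k[n:, n:].mean() - 2.0 * k[:n, n:].mean()


def permutation_pvalue(x, y, rng, n_perm=200):
    """Permutation p-value for H0: both samples share one distribution."""
    observed = gaussian_mmd2(x, y)
    pooled = np.concatenate([x, y])
    n = len(x)
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(len(pooled))
        stat = gaussian_mmd2(pooled[perm[:n]], pooled[perm[n:]])
        count += int(stat >= observed)
    # The +1 terms count the observed labelling among the permutations.
    return (count + 1) / (n_perm + 1)


def nodewise_tests(samples_p, samples_q, alpha=0.05, seed=0):
    """Test p_v = q_v at every node independently (graph is ignored).

    samples_p[v], samples_q[v]: samples drawn from p_v and q_v at node v.
    Returns the per-node p-values and a boolean rejection mask.
    """
    rng = np.random.default_rng(seed)
    pvals = np.array(
        [permutation_pvalue(x, y, rng) for x, y in zip(samples_p, samples_q)]
    )
    return pvals, pvals < alpha
```

Calling `nodewise_tests` with two lists of per-node sample arrays returns one p-value and one rejection decision per node. Because each node is handled in isolation, nothing in this sketch shares information between connected nodes, which is the limitation the collaborative framework is meant to address.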

Cite

Text

De Concha Duarte et al. "Collaborative Non-Parametric Two-Sample Testing." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.

Markdown

[De Concha Duarte et al. "Collaborative Non-Parametric Two-Sample Testing." Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, 2025.](https://mlanthology.org/aistats/2025/conchaduarte2025aistats-collaborative/)

BibTeX

@inproceedings{conchaduarte2025aistats-collaborative,
  title     = {{Collaborative Non-Parametric Two-Sample Testing}},
  author    = {De Concha Duarte, Alejandro David and Vayatis, Nicolas and Kalogeratos, Argyris},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  year      = {2025},
  pages     = {838--846},
  volume    = {258},
  url       = {https://mlanthology.org/aistats/2025/conchaduarte2025aistats-collaborative/}
}