Entropy Testing and Its Application to Testing Bayesian Networks

Abstract

This paper studies the problem of \emph{entropy identity testing}: given sample access to an unknown distribution $p$ and a fully described distribution $q$ (both discrete distributions over a support of size $k$), along with the promise that either $p = q$ or $|H(p) - H(q)| \geqslant \varepsilon$, where $H(\cdot)$ denotes the Shannon entropy, a tester must distinguish between the two cases with high probability. We establish a near-optimal sample complexity bound of $\tilde{\Theta}(\sqrt{k}/\varepsilon + 1/\varepsilon^2)$ for this problem, and show how to apply it to identity testing of in-degree-$d$, $n$-dimensional Bayesian networks, obtaining an upper bound of $\tilde{O}(2^{d/2} n^{3/2}/\varepsilon^2 + n^2/\varepsilon^4)$. This improves on the sample complexity bound of $\tilde{O}(2^{d/2} n^2/\varepsilon^4)$ of Canonne, Diakonikolas, Kane, and Stewart (2020), which moreover required an additional assumption on the structure of the (unknown) Bayesian network.
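
To make the problem setup concrete, here is a minimal Python sketch of a naive \emph{plug-in} entropy tester: estimate $H(p)$ from the empirical distribution of the samples and compare it to the (exactly computable) $H(q)$. This baseline is not the paper's algorithm, and its sample complexity is on the order of $k/\varepsilon^2$, worse than the $\tilde{\Theta}(\sqrt{k}/\varepsilon + 1/\varepsilon^2)$ bound above; the function names and the acceptance threshold of $\varepsilon/2$ are illustrative choices, not taken from the paper.

import numpy as np

def shannon_entropy(dist):
    # H(p) = -sum_i p_i log p_i (in nats), ignoring zero-probability entries.
    p = np.asarray(dist, dtype=float)
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

def plug_in_entropy_test(samples, q, eps):
    # Accept (return True) iff the plug-in estimate of H(p), computed from
    # the empirical distribution of the samples, lies within eps/2 of H(q).
    # Reliable only with on the order of k/eps^2 samples, versus the
    # ~sqrt(k)/eps + 1/eps^2 of the paper's tester (up to log factors).
    k = len(q)
    counts = np.bincount(np.asarray(samples), minlength=k)
    p_hat = counts / counts.sum()
    return abs(shannon_entropy(p_hat) - shannon_entropy(q)) < eps / 2

# Example: q uniform over k = 1000; samples from q should be accepted, while
# samples from a distribution whose entropy is log(2) lower should not.
rng = np.random.default_rng(0)
k, eps = 1000, 0.5
q = np.full(k, 1.0 / k)
print(plug_in_entropy_test(rng.integers(0, k, size=200_000), q, eps))       # True
print(plug_in_entropy_test(rng.integers(0, k // 2, size=200_000), q, eps))  # False

Relatedly, any distribution that factorizes according to a DAG with parent sets $\Pi_i$ satisfies $H(X_1, \dots, X_n) = \sum_{i=1}^n H(X_i \mid \Pi_i)$ by the chain rule of entropy, which is what makes entropy a natural quantity for comparing Bayesian networks node by node.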

Cite

Text

Canonne and Yang. "Entropy Testing and Its Application to Testing Bayesian Networks." Neural Information Processing Systems, 2024. doi:10.52202/079017-4227

Markdown

[Canonne and Yang. "Entropy Testing and Its Application to Testing Bayesian Networks." Neural Information Processing Systems, 2024.](https://mlanthology.org/neurips/2024/canonne2024neurips-entropy/) doi:10.52202/079017-4227

BibTeX

@inproceedings{canonne2024neurips-entropy,
  title     = {{Entropy Testing and Its Application to Testing Bayesian Networks}},
  author    = {Canonne, Clément L. and Yang, Joy Qiping},
  booktitle = {Neural Information Processing Systems},
  year      = {2024},
  doi       = {10.52202/079017-4227},
  url       = {https://mlanthology.org/neurips/2024/canonne2024neurips-entropy/}
}