Learning Tree Structures from Noisy Data

Abstract

We provide high-probability sample complexity guarantees for exact structure recovery of tree-structured graphical models, when only noisy observations of the respective vertex emissions are available. We assume that the hidden variables follow either an Ising model or a Gaussian graphical model, and that the observables are noise-corrupted versions of the hidden variables: multiplicative $\pm 1$ binary noise for Ising models, and additive Gaussian noise for Gaussian models. Such hidden models arise naturally in a variety of applications, such as physics, biology, computer science, and finance. We study the impact of measurement noise on the task of learning the underlying tree structure via the well-known \textit{Chow-Liu algorithm}, and provide formal sample complexity guarantees for exact recovery. In particular, for a tree with $p$ vertices and probability of failure $\delta>0$, we show that the number of samples necessary for exact structure recovery is of the order of $\mathcal{O}(\log(p/\delta))$ for Ising models (which remains the \textit{same as in the noiseless case}), and $\mathcal{O}(\mathrm{polylog}(p/\delta))$ for Gaussian models.
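For reference, the Chow-Liu algorithm the abstract refers to estimates all pairwise mutual informations from samples and returns a maximum-weight spanning tree under those weights. Below is a minimal Python sketch of that classical procedure on $\pm 1$ data, not the authors' code: the names (empirical_mi, chow_liu_tree) and the synthetic noisy-chain example are illustrative assumptions, not from the paper.

import numpy as np

def empirical_mi(x, y):
    # Plug-in estimate of the mutual information between two
    # +/-1-valued sample vectors.
    mi = 0.0
    for a in (-1, 1):
        for b in (-1, 1):
            p_ab = np.mean((x == a) & (y == b))
            if p_ab > 0:
                mi += p_ab * np.log(p_ab / (np.mean(x == a) * np.mean(y == b)))
    return mi

def chow_liu_tree(samples):
    # samples: (n, p) array of +/-1 observations. Returns the edge list of a
    # maximum-weight spanning tree with empirical mutual-information weights.
    _, p = samples.shape
    weights = [(empirical_mi(samples[:, i], samples[:, j]), i, j)
               for i in range(p) for j in range(i + 1, p)]
    parent = list(range(p))  # union-find forest for Kruskal's algorithm
    def find(u):
        while parent[u] != u:
            parent[u] = parent[parent[u]]  # path halving
            u = parent[u]
        return u
    edges = []
    for _, i, j in sorted(weights, reverse=True):  # heaviest edges first
        ri, rj = find(i), find(j)
        if ri != rj:  # keep the edge only if it joins two components
            parent[ri] = rj
            edges.append((i, j))
    return edges

# Hypothetical usage: a Markov chain on a path graph with 6 vertices,
# observed through multiplicative +/-1 noise that flips each entry with
# probability 0.1 (the Ising-model noise setting of the abstract).
rng = np.random.default_rng(0)
n, p, corr = 10000, 6, 0.9
X = np.empty((n, p))
X[:, 0] = rng.choice([-1.0, 1.0], size=n)
for j in range(1, p):
    agree = rng.random(n) < (1 + corr) / 2
    X[:, j] = np.where(agree, X[:, j - 1], -X[:, j - 1])
noise = rng.choice([-1.0, 1.0], size=(n, p), p=[0.1, 0.9])
print(sorted(chow_liu_tree(X * noise)))  # typically recovers (0,1),...,(4,5)

In this sketch the noise only shrinks the pairwise correlations, so the relative ordering of the mutual-information weights along the tree is preserved and, with enough samples, the noiseless tree is still recovered; the paper quantifies exactly how many samples suffice.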

Cite

Text

Nikolakakis et al. "Learning Tree Structures from Noisy Data." Artificial Intelligence and Statistics, 2019.

Markdown

[Nikolakakis et al. "Learning Tree Structures from Noisy Data." Artificial Intelligence and Statistics, 2019.](https://mlanthology.org/aistats/2019/nikolakakis2019aistats-learning/)

BibTeX

@inproceedings{nikolakakis2019aistats-learning,
  title     = {{Learning Tree Structures from Noisy Data}},
  author    = {Nikolakakis, Konstantinos E. and Kalogerias, Dionysios S. and Sarwate, Anand D.},
  booktitle = {Artificial Intelligence and Statistics},
  year      = {2019},
  pages     = {1771--1782},
  volume    = {89},
  url       = {https://mlanthology.org/aistats/2019/nikolakakis2019aistats-learning/}
}