A Kernel Stein Test of Goodness of Fit for Sequential Models

Abstract

We propose a goodness-of-fit measure for probability densities modeling observations with varying dimensionality, such as text documents of differing lengths or variable-length sequences. The proposed measure is an instance of the kernel Stein discrepancy (KSD), which has been used to construct goodness-of-fit tests for unnormalized densities. The KSD is defined by its Stein operator: current operators used in testing apply only to fixed-dimensional spaces. As our main contribution, we extend the KSD to the variable-dimension setting by identifying appropriate Stein operators, and propose a novel KSD goodness-of-fit test. As with previous variants, the proposed KSD does not require the density to be normalized, allowing the evaluation of a large class of models. Our test is shown to perform well in practice on discrete sequential-data benchmarks.
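To make the KSD concrete in the fixed-dimensional setting that the paper generalizes, the sketch below estimates the squared KSD for a 1-D model using the standard Langevin-Stein operator and a Gaussian kernel. This is a minimal illustration, not the paper's variable-dimension construction: the model (a standard normal, whose score `-x` needs no normalizing constant), the kernel bandwidth, and the V-statistic estimator are all choices made here for demonstration.

```python
import numpy as np

def score_std_normal(x):
    # Score d/dx log p(x) for the unnormalized density p(x) ∝ exp(-x^2/2);
    # the normalizing constant drops out, which is the key appeal of the KSD.
    return -x

def ksd_v_stat(x, score, sigma=1.0):
    """V-statistic estimate of the squared KSD with the Langevin-Stein
    operator and Gaussian kernel k(x, y) = exp(-(x - y)^2 / (2 sigma^2)),
    for a 1-D sample x. Larger values indicate worse fit to the model."""
    x = np.asarray(x, dtype=float)
    d = x[:, None] - x[None, :]
    k = np.exp(-d**2 / (2 * sigma**2))
    dkx = -d / sigma**2 * k                       # ∂k/∂x
    dky = d / sigma**2 * k                        # ∂k/∂y
    dkxy = (1 / sigma**2 - d**2 / sigma**4) * k   # ∂²k/∂x∂y
    s = score(x)
    # Stein kernel: h(x, y) = s(x)s(y)k + s(x)∂_y k + s(y)∂_x k + ∂_x∂_y k
    h = s[:, None] * s[None, :] * k + s[:, None] * dky + s[None, :] * dkx + dkxy
    return h.mean()

rng = np.random.default_rng(0)
good = rng.normal(0.0, 1.0, size=500)  # sample matching the model
bad = rng.normal(2.0, 1.0, size=500)   # mean-shifted alternative
print(ksd_v_stat(good, score_std_normal) < ksd_v_stat(bad, score_std_normal))
```

A goodness-of-fit test then compares such an estimate against a null threshold (e.g. obtained by a wild bootstrap); the sample from the mean-shifted alternative yields a visibly larger discrepancy than the well-specified sample.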

Cite

Text

Baum et al. "A Kernel Stein Test of Goodness of Fit for Sequential Models." International Conference on Machine Learning, 2023.

Markdown

[Baum et al. "A Kernel Stein Test of Goodness of Fit for Sequential Models." International Conference on Machine Learning, 2023.](https://mlanthology.org/icml/2023/baum2023icml-kernel/)

BibTeX

@inproceedings{baum2023icml-kernel,
  title     = {{A Kernel Stein Test of Goodness of Fit for Sequential Models}},
  author    = {Baum, Jerome and Kanagawa, Heishiro and Gretton, Arthur},
  booktitle = {International Conference on Machine Learning},
  year      = {2023},
  pages     = {1936--1953},
  volume    = {202},
  url       = {https://mlanthology.org/icml/2023/baum2023icml-kernel/}
}