Learning Deterministic Weighted Automata with Queries and Counterexamples

Abstract

We present an algorithm for reconstructing a probabilistic deterministic finite automaton (PDFA) from a given black-box language model, such as a recurrent neural network (RNN). The algorithm is a variant of the exact-learning algorithm L*, adapted to work in a probabilistic setting under noise. The key insights of the adaptation are the use of conditional probabilities when making observations on the model, and the introduction of a variation tolerance when comparing observations. When applied to RNNs, our algorithm returns models with word error rate (WER) and normalised discounted cumulative gain (NDCG) that are better than or equal to those achieved by n-gram or weighted finite automata (WFA) approximations of the same networks. The PDFAs capture a richer class of languages than n-grams, and, unlike the WFAs, are guaranteed to be stochastic and deterministic.
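To make the two adaptations concrete, here is a minimal Python sketch of the tolerance-based comparison the abstract describes: states are identified by the model's conditional next-symbol distributions, and two observations are accepted as equal when they agree within a variation tolerance. The model interface (`next_symbol_prob`), the stub class, and the tolerance value are illustrative assumptions, not the paper's actual API.

```python
# Sketch (not the authors' released implementation) of the abstract's key
# idea: observe conditional next-symbol probabilities rather than
# full-sequence probabilities, and compare them up to a tolerance t.

def within_tolerance(p, q, t):
    """True iff corresponding conditional probabilities differ by at most t."""
    return all(abs(pi - qi) <= t for pi, qi in zip(p, q))

def observations_agree(model, prefix_a, prefix_b, alphabet, t=0.1):
    """Compare two prefixes by the black-box model's conditional
    next-symbol distributions; with tolerance t they may be merged
    into the same PDFA state."""
    p = [model.next_symbol_prob(prefix_a, s) for s in alphabet]
    q = [model.next_symbol_prob(prefix_b, s) for s in alphabet]
    return within_tolerance(p, q, t)

class UniformStub:
    """Toy stand-in for a black-box language model (e.g. an RNN)."""
    def next_symbol_prob(self, prefix, symbol):
        return 0.5  # uniform over a binary alphabet

print(observations_agree(UniformStub(), "ab", "ba", alphabet="ab"))  # True
```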

Cite

Text

Weiss et al. "Learning Deterministic Weighted Automata with Queries and Counterexamples." Neural Information Processing Systems, 2019.

Markdown

[Weiss et al. "Learning Deterministic Weighted Automata with Queries and Counterexamples." Neural Information Processing Systems, 2019.](https://mlanthology.org/neurips/2019/weiss2019neurips-learning/)

BibTeX

@inproceedings{weiss2019neurips-learning,
  title     = {{Learning Deterministic Weighted Automata with Queries and Counterexamples}},
  author    = {Weiss, Gail and Goldberg, Yoav and Yahav, Eran},
  booktitle = {Neural Information Processing Systems},
  year      = {2019},
  pages     = {8560--8571},
  url       = {https://mlanthology.org/neurips/2019/weiss2019neurips-learning/}
}