Unsupervised Spectral Learning of Finite State Transducers

Abstract

Finite-State Transducers (FSTs) are a standard tool for modeling paired input-output sequences and are used in numerous applications, ranging from computational biology to natural language processing. Recently, Balle et al. presented a spectral algorithm for learning FSTs from samples of aligned input-output sequences. In this paper we address the more realistic, yet more challenging, setting where the alignments are unknown to the learning algorithm. We frame FST learning as finding a low-rank Hankel matrix satisfying constraints derived from observable statistics. Under this formulation, we provide identifiability results for FST distributions. Then, following previous work on rank minimization, we propose a regularized convex relaxation of this objective, which is based on minimizing a nuclear norm penalty subject to linear constraints and can be solved efficiently.
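
To make the two central objects of the abstract concrete, the following minimal sketch builds a Hankel block and computes its rank and nuclear norm. It is an illustration under stated assumptions, not the algorithm of the paper: it uses a plain weighted automaton over a single alphabet instead of a transducer over input-output pairs, and the automaton weights, the basis of strings, and the names `f`, `strings_up_to`, and `hankel` are all hypothetical.

```python
import itertools
import numpy as np

# Hypothetical 2-state weighted automaton over the alphabet {"a", "b"}.
# All weights below are made up for illustration.
alpha = np.array([1.0, 0.0])             # initial weights
beta = np.array([0.2, 0.5])              # final weights
A = {
    "a": np.array([[0.3, 0.2], [0.0, 0.1]]),
    "b": np.array([[0.1, 0.2], [0.2, 0.2]]),
}

def f(word):
    """Value assigned to a string: alpha^T A_x1 ... A_xn beta."""
    v = alpha
    for symbol in word:
        v = v @ A[symbol]
    return float(v @ beta)

def strings_up_to(length):
    """All strings over {"a", "b"} with at most `length` symbols."""
    return ["".join(t) for n in range(length + 1)
            for t in itertools.product("ab", repeat=n)]

def hankel(prefixes, suffixes):
    """Hankel block with entries H[p, s] = f(p + s)."""
    return np.array([[f(p + s) for s in suffixes] for p in prefixes])

basis = strings_up_to(2)                 # 1 + 2 + 4 = 7 strings
H = hankel(basis, basis)                 # 7 x 7 Hankel block

# The rank of H is bounded by the number of states (here 2).
singular_values = np.linalg.svd(H, compute_uv=False)
rank = int(np.linalg.matrix_rank(H, tol=1e-10))

# The nuclear norm is the sum of singular values; it is the convex
# surrogate for rank that the relaxation in the abstract penalizes.
nuclear_norm = float(singular_values.sum())

print("shape:", H.shape, "rank:", rank, "nuclear norm:", nuclear_norm)
```

Because every entry factors as a prefix vector times a suffix vector, the block has rank at most the number of states. This is why a low-rank Hankel matrix is a natural target, and why the nuclear norm serves as a tractable stand-in for rank in a convex program.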

Cite

Text

Bailly et al. "Unsupervised Spectral Learning of Finite State Transducers." Neural Information Processing Systems, 2013.

Markdown

[Bailly et al. "Unsupervised Spectral Learning of Finite State Transducers." Neural Information Processing Systems, 2013.](https://mlanthology.org/neurips/2013/bailly2013neurips-unsupervised/)

BibTeX

@inproceedings{bailly2013neurips-unsupervised,
  title     = {{Unsupervised Spectral Learning of Finite State Transducers}},
  author    = {Bailly, Raphael and Carreras, Xavier and Quattoni, Ariadna},
  booktitle = {Neural Information Processing Systems},
  year      = {2013},
  pages     = {800--808},
  url       = {https://mlanthology.org/neurips/2013/bailly2013neurips-unsupervised/}
}