A Convergence Analysis of Log-Linear Training

Abstract

Log-linear models are widely used probability models for statistical pattern recognition, and interest in them has increased greatly in recent years. Typically, log-linear models are trained according to a convex criterion. The optimization of the model parameters is costly and therefore an important topic, in particular for large-scale applications. Different optimization algorithms have been evaluated empirically in many papers. In this work, we analyze the optimization problem analytically and show that the training of log-linear models can be highly ill-conditioned. We verify our findings on two handwriting tasks. By making use of our convergence analysis, we obtain good results on a large-scale continuous handwriting recognition task with a simple and generic approach.
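
To make the notion of ill-conditioning concrete, here is a minimal sketch (in Python with NumPy; the data, scaling factors, and function names are hypothetical and not taken from the paper) of a two-class log-linear model trained with the convex negative log-likelihood criterion. The Hessian of this criterion is X^T diag(p(1-p)) X, so its condition number is driven by the scaling and correlation of the features; mean and variance normalization of the features is one simple and generic remedy.

import numpy as np

# Illustrative sketch only: a two-class log-linear model
# p(c|x) proportional to exp(w_c^T x), trained by minimizing the
# convex negative log-likelihood.

rng = np.random.default_rng(0)

n, d = 1000, 2
# Hypothetical features with very different scales, a typical source
# of ill-conditioning.
X = rng.normal(size=(n, d)) * np.array([1.0, 100.0])

def nll_hessian(w, X):
    # Hessian of the negative log-likelihood of a binary log-linear
    # model: X^T diag(p(1-p)) X.
    p = 1.0 / (1.0 + np.exp(-(X @ w)))
    return (X * (p * (1.0 - p))[:, None]).T @ X

print("condition number:", np.linalg.cond(nll_hessian(np.zeros(d), X)))

# Mean/variance normalization of the features brings the condition
# number down by orders of magnitude.
Xn = (X - X.mean(axis=0)) / X.std(axis=0)
print("after normalization:", np.linalg.cond(nll_hessian(np.zeros(d), Xn)))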

Cite

Text

Wiesler and Ney. "A Convergence Analysis of Log-Linear Training." Neural Information Processing Systems, 2011.

Markdown

[Wiesler and Ney. "A Convergence Analysis of Log-Linear Training." Neural Information Processing Systems, 2011.](https://mlanthology.org/neurips/2011/wiesler2011neurips-convergence/)

BibTeX

@inproceedings{wiesler2011neurips-convergence,
  title     = {{A Convergence Analysis of Log-Linear Training}},
  author    = {Wiesler, Simon and Ney, Hermann},
  booktitle = {Neural Information Processing Systems},
  year      = {2011},
  pages     = {657--665},
  url       = {https://mlanthology.org/neurips/2011/wiesler2011neurips-convergence/}
}