On the Effectiveness of the Skew Divergence for Statistical Language Analysis

Abstract

Estimating word co-occurrence probabilities is a problem underlying many applications in statistical natural language processing. Distance-weighted (or similarity-weighted) averaging has been shown to be a promising approach to the analysis of novel co-occurrences. Many measures of distributional similarity have been proposed for use in the distance-weighted averaging framework; here, we empirically study their stability properties, finding that similarity-based estimation appears to make more efficient use of more reliable portions of the training data. We also investigate properties of the skew divergence, a weighted version of the Kullback-Leibler (KL) divergence; our results indicate that the skew divergence yields better results than the KL divergence even when the KL divergence is applied to more sophisticated probability estimates.
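
For context, here is a minimal Python sketch of the skew divergence mentioned in the abstract, assuming its usual definition s_α(q, r) = KL(r ‖ αq + (1 − α)r); the value of α and the example distributions below are illustrative assumptions, not figures taken from the paper.

```python
# Sketch of the skew divergence, assuming the usual definition
#   s_alpha(q, r) = KL(r || alpha*q + (1 - alpha)*r).
# alpha and the example distributions are illustrative, not from the paper.
import numpy as np

def kl_divergence(p, q):
    """KL(p || q) for discrete distributions given as aligned NumPy arrays."""
    mask = p > 0  # terms with p(x) = 0 contribute nothing
    return float(np.sum(p[mask] * np.log(p[mask] / q[mask])))

def skew_divergence(q, r, alpha=0.99):
    """Skew divergence: KL of r against a mixture of q and r.

    Mixing a little of r into q keeps the second argument nonzero wherever
    r is nonzero, so the value stays finite even when q assigns zero
    probability to events that r does not (unlike plain KL(r || q)).
    """
    return kl_divergence(r, alpha * q + (1.0 - alpha) * r)

# Illustrative co-occurrence distributions over a small vocabulary.
r = np.array([0.5, 0.3, 0.2, 0.0])
q = np.array([0.4, 0.0, 0.4, 0.2])  # q(x) = 0 where r(x) > 0, so KL(r || q) is infinite
print(skew_divergence(q, r))        # finite; approaches KL(r || q) as alpha -> 1
```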

Cite

Text

Lee. "On the Effectiveness of the Skew Divergence for Statistical Language Analysis." Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, 2001.

Markdown

[Lee. "On the Effectiveness of the Skew Divergence for Statistical Language Analysis." Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics, 2001.](https://mlanthology.org/aistats/2001/lee2001aistats-effectiveness/)

BibTeX

@inproceedings{lee2001aistats-effectiveness,
  title     = {{On the Effectiveness of the Skew Divergence for Statistical Language Analysis}},
  author    = {Lee, Lillian},
  booktitle = {Proceedings of the Eighth International Workshop on Artificial Intelligence and Statistics},
  year      = {2001},
  pages     = {176--183},
  volume    = {R3},
  url       = {https://mlanthology.org/aistats/2001/lee2001aistats-effectiveness/}
}