A Statistical Method for Handling Unknown Words

Franz, Alexander

AAAI 1994, p. 1447

Abstract

Robust natural language processing systems must be able to handle words that are not in their lexicon. We created a classifier, trained on tagged text, that finds the most likely parts of speech for unknown words. The classifier uses a contingency table to count the observed features and a loglinear model to smooth the cell counts. After smoothing, the contingency table yields the conditional probability distribution used for classification. The features were determined by exploratory data analysis (Tukey 1977); for example, whether the word is capitalized, or whether it carries one of a number of known suffixes. We maximize the conditional probability of the proposed classification given the features to achieve minimum error rate classification (Duda & Hart 1973).

We compare four models. The baseline uses only the prior probabilities P(c) (column Prior). Weischedel et al. (1993) describe a probabilistic model with four features that are treated as independent, which we reimplemented (column 4 Indep). For comparison, we created a statistical classifier with the same four features (column 4 Class). Our best model was a classifier with nine features (column 9 Class).
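The classification procedure described in the abstract can be illustrated with a minimal sketch. It counts (features, tag) cells in a contingency table, smooths the counts, and picks the tag that maximizes P(tag | features). Several parts are assumptions made for illustration and are not taken from the paper: the suffix list, the toy training data, the function names, and above all the smoothing step, where simple additive smoothing stands in for the loglinear model the abstract describes.

```python
from collections import Counter

# Illustrative suffix list; an assumption, not the paper's feature set.
SUFFIXES = ("ing", "ed", "ly", "s")

def features(word):
    """Two features named in the abstract: capitalization and known suffix."""
    capitalized = word[:1].isupper()
    suffix = next((s for s in SUFFIXES if word.lower().endswith(s)), "")
    return (capitalized, suffix)

def train(tagged_words):
    """Count the (features, tag) cells of the contingency table."""
    table = Counter()
    tags = set()
    for word, tag in tagged_words:
        table[(features(word), tag)] += 1
        tags.add(tag)
    return table, sorted(tags)

def posterior(table, tags, word, alpha=0.5):
    """Return P(tag | features) computed from smoothed cell counts.

    Additive smoothing (alpha added to each cell) is a stand-in for the
    loglinear smoothing described in the abstract.
    """
    f = features(word)
    smoothed = {t: table[(f, t)] + alpha for t in tags}
    total = sum(smoothed.values())
    return {t: count / total for t, count in smoothed.items()}

def classify(table, tags, word):
    """Pick the tag with the highest conditional probability."""
    post = posterior(table, tags, word)
    return max(post, key=post.get)

# Hypothetical tagged training text.
TOY = [
    ("Boston", "NNP"), ("Paris", "NNP"),
    ("running", "VBG"), ("jumping", "VBG"), ("Running", "VBG"),
    ("quickly", "RB"), ("walked", "VBD"), ("dogs", "NNS"),
]
table, tags = train(TOY)

for unknown in ("swimming", "London", "slowly"):
    print(unknown, classify(table, tags, unknown))
```

Because the table is indexed by the full feature tuple, the features are modeled jointly rather than as independent factors, which is the distinction the abstract draws between the "Class" and "Indep" models. With many features most cells are empty, which is why the choice of smoothing matters.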

Cite

Text

Franz. "A Statistical Method for Handling Unknown Words." AAAI Conference on Artificial Intelligence, 1994.

Markdown

[Franz. "A Statistical Method for Handling Unknown Words." AAAI Conference on Artificial Intelligence, 1994.](https://mlanthology.org/aaai/1994/franz1994aaai-statistical/)

BibTeX

@inproceedings{franz1994aaai-statistical,
  title     = {{A Statistical Method for Handling Unknown Words}},
  author    = {Franz, Alexander},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1994},
  pages     = {1447},
  url       = {https://mlanthology.org/aaai/1994/franz1994aaai-statistical/}
}