Dependent Bigram Identification
Abstract
... industry 240 times, industry occurs without oil 1001 times, and bigrams other than oil industry occur 1,298,742 times. This distribution is sparse and skewed and thus violates a central assumption implicit in significance testing of contingency tables (Read & Cressie 1988).

W1 \ W2     industry       ¬industry        totals
oil         n11 = 17       n12 = 240        n1+ = 257
¬oil        n21 = 1001     n22 = 1298742    n2+ = 1299743
totals      n+1 = 1018     n+2 = 1298982    n++ = 1300000

Sensitivity is classically defined as the proportion of true results that agree with the true state. For lexical relationships sensitivity is a conditional probability ...
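The marginal totals in the contingency table above, and the sparsity the abstract points to, can be checked with a few lines of Python. This is only an illustrative sketch, not the paper's method: the helper names `margins` and `expected_counts` are hypothetical, and the expected counts are computed under the standard independence assumption (row total times column total, divided by the grand total).

```python
# Observed counts for the bigram "oil industry", copied from the table above.
n11 = 17         # oil followed by industry
n12 = 240        # oil followed by a word other than industry
n21 = 1001       # industry preceded by a word other than oil
n22 = 1298742    # bigrams containing neither oil first nor industry second


def margins(n11, n12, n21, n22):
    """Return (row totals, column totals, grand total) of a 2x2 table."""
    rows = (n11 + n12, n21 + n22)
    cols = (n11 + n21, n12 + n22)
    return rows, cols, sum(rows)


def expected_counts(n11, n12, n21, n22):
    """Expected cell counts if the two words occurred independently.

    Assumption for illustration: m_ij = (row_i total * col_j total) / n++.
    """
    rows, cols, total = margins(n11, n12, n21, n22)
    return [[r * c / total for c in cols] for r in rows]


rows, cols, total = margins(n11, n12, n21, n22)
expected = expected_counts(n11, n12, n21, n22)

print("row totals:   ", rows)
print("column totals:", cols)
print("grand total:  ", total)
print("expected count for 'oil industry': %.4f" % expected[0][0])
```

Under independence the expected count for the oil industry cell is roughly 0.2, against an observed count of 17, while nearly all of the mass sits in the n22 cell. That imbalance is what "sparse and skewed" refers to here.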
Cite
Text
Pedersen. "Dependent Bigram Identification." AAAI Conference on Artificial Intelligence, 1998.
Markdown
[Pedersen. "Dependent Bigram Identification." AAAI Conference on Artificial Intelligence, 1998.](https://mlanthology.org/aaai/1998/pedersen1998aaai-dependent/)
BibTeX
@inproceedings{pedersen1998aaai-dependent,
title = {{Dependent Bigram Identification}},
author = {Pedersen, Ted},
booktitle = {AAAI Conference on Artificial Intelligence},
year = {1998},
pages = {1197},
url = {https://mlanthology.org/aaai/1998/pedersen1998aaai-dependent/}
}