Dependent Bigram Identification

Abstract

[...] industry 240 times, industry occurs without oil 1001 times, and bigrams other than oil industry occur 1,298,742 times. This distribution is sparse and skewed and thus violates a central assumption implicit in significance testing of contingency tables (Read & Cressie 1988).

W1 \ W2    industry      ¬industry        totals
oil        n11 = 17      n12 = 240        n1+ = 257
¬oil       n21 = 1001    n22 = 1298742    n2+ = 1299743
totals     n+1 = 1018    n+2 = 1298982    n++ = 1300000

Sensitivity is classically defined as the proportion of true results that agree with the true state. For lexical relationships, sensitivity is a conditional probability [...]
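The sparseness of this table is easier to see when the observed counts are set beside the counts expected if oil and industry occurred independently. The following is a minimal sketch of that check, assuming the standard independence model for a 2x2 table (expected count = row total × column total / grand total). The function names `margins` and `expected_counts` are hypothetical, and the sketch is not necessarily the procedure used in the paper.

```python
# Observed bigram counts from the contingency table in the abstract.
# Rows: W1 is "oil" / not "oil"; columns: W2 is "industry" / not "industry".
observed = [
    [17, 240],
    [1001, 1298742],
]


def margins(table):
    """Return (row totals, column totals, grand total) of a 2-D count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    return row_totals, col_totals, sum(row_totals)


def expected_counts(table):
    """Expected cell counts assuming the two words occur independently."""
    row_totals, col_totals, total = margins(table)
    return [[r * c / total for c in col_totals] for r in row_totals]


row_totals, col_totals, total = margins(observed)
expected = expected_counts(observed)

print("row totals:", row_totals)
print("column totals:", col_totals)
print("grand total:", total)
for obs_row, exp_row in zip(observed, expected):
    for obs, exp in zip(obs_row, exp_row):
        print(f"observed = {obs:>8}  expected under independence = {exp:.4f}")
```

Under these assumptions the expected count for oil industry falls well below one, while the observed count is 17. A cell that small is the kind of sparse data for which large-sample approximations in contingency-table tests are usually considered unreliable.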

Cite

Text

Pedersen. "Dependent Bigram Identification." AAAI Conference on Artificial Intelligence, 1998.

Markdown

[Pedersen. "Dependent Bigram Identification." AAAI Conference on Artificial Intelligence, 1998.](https://mlanthology.org/aaai/1998/pedersen1998aaai-dependent/)

BibTeX

@inproceedings{pedersen1998aaai-dependent,
  title     = {{Dependent Bigram Identification}},
  author    = {Pedersen, Ted},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {1998},
  pages     = {1197},
  url       = {https://mlanthology.org/aaai/1998/pedersen1998aaai-dependent/}
}