Domain-Specific Keyphrase Extraction

Abstract

Document keyphrases provide semantic metadata that characterizes a document and gives an overview of its content. They can be used in many text-mining and knowledge-management applications. This paper describes the Keyphrase Identification Program (KIP), which extracts document keyphrases by using prior positive samples of human-identified domain keyphrases to assign weights to candidate keyphrases. The logic of the algorithm is that the more keywords a candidate keyphrase contains, and the more significant those keywords are, the more likely the candidate is to be a keyphrase. To obtain prior positive inputs, KIP first populates its glossary database with manually identified keyphrases and keywords. It then checks the composition of every noun phrase in a document, looks the components up in the database, and calculates a score for each noun phrase. The noun phrases with the highest scores are extracted as keyphrases.
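The scoring idea in the abstract can be illustrated with a minimal sketch. The abstract does not give KIP's actual formula, so everything below is an assumption made for illustration: the additive score (sum of keyword weights plus a bonus for phrases already in the glossary), the function names `score_phrase` and `extract_keyphrases`, the `top_k` cutoff, and the glossary weights themselves. Noun-phrase detection is also assumed to have happened upstream; the candidates are passed in as plain strings.

```python
from typing import Dict, Iterable, List, Tuple

# Hypothetical glossary contents; the weights are invented for illustration.
KEYWORD_WEIGHTS: Dict[str, float] = {
    "keyphrase": 0.9,
    "extraction": 0.7,
    "document": 0.4,
}
KNOWN_KEYPHRASES: Dict[str, float] = {
    "keyphrase extraction": 1.0,
}


def normalize(phrase: str) -> str:
    """Lowercase the phrase and collapse runs of whitespace."""
    return " ".join(phrase.lower().split())


def score_phrase(
    phrase: str,
    keyword_weights: Dict[str, float],
    known_keyphrases: Dict[str, float],
) -> float:
    """Assumed scoring rule: sum the weights of the glossary keywords in the
    phrase, then add a bonus if the whole phrase is a known keyphrase."""
    key = normalize(phrase)
    keyword_score = sum(keyword_weights.get(word, 0.0) for word in key.split())
    return keyword_score + known_keyphrases.get(key, 0.0)


def extract_keyphrases(
    noun_phrases: Iterable[str],
    keyword_weights: Dict[str, float],
    known_keyphrases: Dict[str, float],
    top_k: int = 2,
) -> List[Tuple[str, float]]:
    """Score every candidate noun phrase and return the top_k by score."""
    scored = {
        normalize(p): score_phrase(p, keyword_weights, known_keyphrases)
        for p in noun_phrases
    }
    # Sort by descending score; break ties alphabetically for determinism.
    ranked = sorted(scored.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_k]


if __name__ == "__main__":
    candidates = ["Keyphrase extraction", "document", "the weather"]
    for phrase, score in extract_keyphrases(
        candidates, KEYWORD_WEIGHTS, KNOWN_KEYPHRASES
    ):
        print(f"{phrase}: {score:.2f}")
```

With these made-up weights, a phrase that contains two significant keywords and also appears in the glossary ranks above a single low-weight keyword, while a phrase with no glossary words scores zero. A real system would likely use a fixed score threshold or a learned weighting rather than the simple sum assumed here.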

Cite

Text

Frank et al. "Domain-Specific Keyphrase Extraction." International Joint Conference on Artificial Intelligence, 1999. doi:10.1145/1099554.1099628

Markdown

[Frank et al. "Domain-Specific Keyphrase Extraction." International Joint Conference on Artificial Intelligence, 1999.](https://mlanthology.org/ijcai/1999/frank1999ijcai-domain/) doi:10.1145/1099554.1099628

BibTeX

@inproceedings{frank1999ijcai-domain,
  title     = {{Domain-Specific Keyphrase Extraction}},
  author    = {Frank, Eibe and Paynter, Gordon W. and Witten, Ian H. and Gutwin, Carl and Nevill-Manning, Craig G.},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {1999},
  pages     = {668-673},
  doi       = {10.1145/1099554.1099628},
  url       = {https://mlanthology.org/ijcai/1999/frank1999ijcai-domain/}
}