Multi-Level Boundary Classification for Information Extraction

Abstract

We investigate the application of classification techniques to the problem of information extraction (IE). In particular, we use support vector machines and several different feature sets to build a set of classifiers for IE. We show that this approach is competitive with current state-of-the-art IE algorithms based on specialized learning algorithms. We also introduce a new technique for improving the recall of our IE algorithm. This technique uses a two-level ensemble of classifiers to improve the recall of the extracted fragments while maintaining high precision. We show that this approach outperforms current state-of-the-art IE algorithms on several benchmark IE tasks.
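
As a rough illustration of the two-level idea, the sketch below treats extraction as finding start and end token boundaries. A conservative first level proposes boundaries, and a second level searches only near those proposals for boundaries the first level missed. The abstract does not say how the two levels are combined, so the windowing scheme, the names (`extract_fragments`, `l1_start`, `l1_end`, `l2_start`, `l2_end`, `window`), and the toy predicates standing in for trained SVM classifiers are all assumptions made for illustration, not the authors' implementation.

```python
from typing import Callable, List, Tuple

# A boundary classifier maps (tokens, index) to True/False.
# ASSUMPTION: in the paper these would be trained SVMs; here they are
# hypothetical hand-written predicates so the sketch stays self-contained.
BoundaryClassifier = Callable[[List[str], int], bool]


def extract_fragments(
    tokens: List[str],
    l1_start: BoundaryClassifier,
    l1_end: BoundaryClassifier,
    l2_start: BoundaryClassifier,
    l2_end: BoundaryClassifier,
    window: int = 3,
) -> List[Tuple[int, int]]:
    """Return inclusive (start, end) token-index pairs."""
    n = len(tokens)

    # Level 1: conservative boundary predictions over every token.
    starts = {i for i in range(n) if l1_start(tokens, i)}
    ends = {i for i in range(n) if l1_end(tokens, i)}

    # Level 2: search only near a level-1 boundary for its missing partner.
    l2_ends = {
        j
        for i in starts
        for j in range(i, min(n, i + window + 1))
        if l2_end(tokens, j)
    }
    l2_starts = {
        j
        for i in ends
        for j in range(max(0, i - window), i + 1)
        if l2_start(tokens, j)
    }

    all_starts = starts | l2_starts
    all_ends = ends | l2_ends

    # Pair each start with the nearest end inside the window.
    fragments = []
    for s in sorted(all_starts):
        candidates = [e for e in all_ends if s <= e <= s + window]
        if candidates:
            fragments.append((s, min(candidates)))
    return fragments


def never(toks: List[str], i: int) -> bool:
    """Stand-in for a classifier that makes no positive predictions."""
    return False


def title_start(toks: List[str], i: int) -> bool:
    """Toy level-1 start detector: fires on the honorific 'Dr.'."""
    return toks[i] == "Dr."


def last_capitalized(toks: List[str], i: int) -> bool:
    """Toy level-2 end detector: a capitalized token not followed by one."""
    is_cap = toks[i][0].isupper()
    next_is_cap = i + 1 < len(toks) and toks[i + 1][0].isupper()
    return is_cap and not next_is_cap


tokens = "Seminar by Dr. Jane Smith at 3 pm".split()

# Level 1 alone finds a start but no end, so nothing is extracted.
level1_only = extract_fragments(tokens, title_start, never, never, never)

# Adding the level-2 end detector recovers the missing end boundary.
two_level = extract_fragments(tokens, title_start, never, never, last_capitalized)
speaker = tokens[two_level[0][0] : two_level[0][1] + 1]
```

In this toy run the first level on its own extracts nothing because it never predicts an end boundary, while the second level recovers the fragment `Dr. Jane Smith`. Restricting the second level to a small window around first-level predictions is one plausible way to raise recall without letting a permissive classifier fire across the whole document.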

Cite

Text

Finn and Kushmerick. "Multi-Level Boundary Classification for Information Extraction." European Conference on Machine Learning, 2004. doi:10.1007/978-3-540-30115-8_13

Markdown

[Finn and Kushmerick. "Multi-Level Boundary Classification for Information Extraction." European Conference on Machine Learning, 2004.](https://mlanthology.org/ecmlpkdd/2004/finn2004ecml-multilevel/) doi:10.1007/978-3-540-30115-8_13

BibTeX

@inproceedings{finn2004ecml-multilevel,
  title     = {{Multi-Level Boundary Classification for Information Extraction}},
  author    = {Finn, Aidan and Kushmerick, Nicholas},
  booktitle = {European Conference on Machine Learning},
  year      = {2004},
  pages     = {111-122},
  doi       = {10.1007/978-3-540-30115-8_13},
  url       = {https://mlanthology.org/ecmlpkdd/2004/finn2004ecml-multilevel/}
}