Semi-Supervised Abstraction-Augmented String Kernel for Multi-Level Bio-Relation Extraction
Abstract
Bio-relation extraction (bRE), an important goal in bio-text mining, involves subtasks identifying relationships between bio-entities in text at multiple levels, e.g., at the article, sentence or relation level. A key limitation of current bRE systems is that they are restricted by the availability of annotated corpora. In this work we introduce a semi-supervised approach that can tackle multi-level bRE via string comparisons with mismatches in the string kernel framework. Our string kernel implements an abstraction step that groups similar words into more abstract entities, which can be learned from unlabeled data. Specifically, two unsupervised models are proposed to capture contextual (local or global) semantic similarities between words from a large unannotated corpus. This Abstraction-augmented String Kernel (ASK) allows for better generalization of patterns learned from annotated data and provides a unified framework for solving bRE with multiple degrees of detail. ASK shows effective improvements over classic string kernels on four datasets and achieves state-of-the-art bRE performance without the need for complex linguistic features.
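The idea of comparing strings with mismatches and then adding an abstraction step can be illustrated with a minimal sketch. This is not the authors' formulation: it assumes a simplified pairwise variant that counts pairs of word k-grams differing in at most m positions, and it assumes a hypothetical hand-written `clusters` dictionary in place of the word groupings that the paper learns from unlabeled text. All function names here are illustrative.

```python
from collections import Counter


def kgrams(tokens, k):
    """Return all contiguous k-grams of a token sequence as tuples."""
    return [tuple(tokens[i:i + k]) for i in range(len(tokens) - k + 1)]


def hamming(a, b):
    """Number of positions at which two equal-length k-grams differ."""
    return sum(x != y for x, y in zip(a, b))


def mismatch_kernel(s, t, k=2, m=1):
    """Count pairs of k-grams (one from s, one from t) that differ in at most
    m positions. Simplified stand-in for a mismatch string kernel."""
    cs, ct = Counter(kgrams(s, k)), Counter(kgrams(t, k))
    return sum(ns * nt
               for a, ns in cs.items()
               for b, nt in ct.items()
               if hamming(a, b) <= m)


def abstract(tokens, clusters):
    """Replace each word by its cluster label; unknown words stay unchanged.
    `clusters` is an assumed, hand-made mapping for illustration only."""
    return [clusters.get(w, w) for w in tokens]


def ask_kernel(s, t, clusters, k=2, m=1):
    """Sum of the word-level kernel and the same kernel on abstracted tokens
    (an assumed way of combining the two levels)."""
    word_level = mismatch_kernel(s, t, k, m)
    abstract_level = mismatch_kernel(abstract(s, clusters),
                                     abstract(t, clusters), k, m)
    return word_level + abstract_level


if __name__ == "__main__":
    # Toy sentences and a toy cluster map (both hypothetical).
    s = ["A", "binds", "B"]
    t = ["A", "interacts", "B"]
    clusters = {"binds": "INTERACT", "interacts": "INTERACT"}
    print(mismatch_kernel(s, t, k=2, m=0))
    print(ask_kernel(s, t, clusters, k=2, m=0))
```

With exact matching (m = 0) the two toy sentences share no word bigrams, but after abstraction both verbs map to the same label, so the abstracted sequences match. This is the kind of generalization the abstraction step is meant to provide.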
Cite
Text
Kuksa et al. "Semi-Supervised Abstraction-Augmented String Kernel for Multi-Level Bio-Relation Extraction." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010. doi:10.1007/978-3-642-15883-4_9
Markdown
[Kuksa et al. "Semi-Supervised Abstraction-Augmented String Kernel for Multi-Level Bio-Relation Extraction." European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2010.](https://mlanthology.org/ecmlpkdd/2010/kuksa2010ecmlpkdd-semisupervised/) doi:10.1007/978-3-642-15883-4_9
BibTeX
@inproceedings{kuksa2010ecmlpkdd-semisupervised,
title = {{Semi-Supervised Abstraction-Augmented String Kernel for Multi-Level Bio-Relation Extraction}},
author = {Kuksa, Pavel P. and Qi, Yanjun and Bai, Bing and Collobert, Ronan and Weston, Jason and Pavlovic, Vladimir and Ning, Xia},
booktitle = {European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases},
year = {2010},
pages = {128-144},
doi = {10.1007/978-3-642-15883-4_9},
url = {https://mlanthology.org/ecmlpkdd/2010/kuksa2010ecmlpkdd-semisupervised/}
}