Wrapper Generation via Grammar Induction
Abstract
To facilitate effective search on the World Wide Web, meta search engines have been developed which do not search the Web themselves, but use available search engines to find the required information. By means of wrappers, meta search engines retrieve information from the pages returned by search engines. We present an approach to automatically create such wrappers by means of an incremental grammar induction algorithm. The algorithm uses an adaptation of the string edit distance. Our method performs well; it is quick, can be used for several types of result pages and requires a minimal amount of user interaction.
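As a point of reference for the string edit distance mentioned in the abstract, the following is a minimal sketch of the classic (Levenshtein) edit distance over token sequences. It is not the paper's adapted version, whose details the abstract does not give. The function name `edit_distance`, the unit costs, and the HTML-like token lists are assumptions made for illustration only.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Classic Levenshtein distance between two token sequences.

    Assumption: every insertion, deletion, and substitution costs 1.
    This is the textbook distance, not the adaptation used in the paper.
    """
    # prev[j] holds the distance between the first i-1 tokens of a
    # and the first j tokens of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from the first i tokens of a to an empty sequence
        for j, y in enumerate(b, start=1):
            substitution = prev[j - 1] + (0 if x == y else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1]


# Hypothetical tokenized fragments of two result-page items.
item_a = ["<li>", "<a>", "text", "</a>", "</li>"]
item_b = ["<li>", "<b>", "text", "</b>", "</li>"]
distance = edit_distance(item_a, item_b)
```

Working on tokens rather than characters means a whole tag counts as one edit unit, which is one plausible way to compare markup fragments; the paper may tokenize and weight edits differently.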
Cite
Text
Chidlovskii et al. "Wrapper Generation via Grammar Induction." European Conference on Machine Learning, 2000. doi:10.1007/3-540-45164-1_11
Markdown
[Chidlovskii et al. "Wrapper Generation via Grammar Induction." European Conference on Machine Learning, 2000.](https://mlanthology.org/ecmlpkdd/2000/chidlovskii2000ecml-wrapper/) doi:10.1007/3-540-45164-1_11
BibTeX
@inproceedings{chidlovskii2000ecml-wrapper,
title = {{Wrapper Generation via Grammar Induction}},
author = {Chidlovskii, Boris and Ragetli, Jon and de Rijke, Maarten},
booktitle = {European Conference on Machine Learning},
year = {2000},
pages = {96-108},
doi = {10.1007/3-540-45164-1_11},
url = {https://mlanthology.org/ecmlpkdd/2000/chidlovskii2000ecml-wrapper/}
}