Extracting Relevant Snippets for Web Navigation

Abstract

Search engines present fixed-length passages from documents ranked by relevance to the query. In this paper, we present and compare novel language-model-based methods for extracting variable-length document snippets by processing documents in real time using the query issued by the user. With this extra level of information, the returned snippets are considerably more informative. Unlike previous work on passage retrieval, which relies on searching for relevant segments to filter predefined passages, we focus on query-informed segmentation to extract context-aware, relevant snippets of variable length. In particular, we show that, when informed by an appropriate relevance language model, curvature analysis and hidden Markov model (HMM) based content segmentation techniques can facilitate the extraction of relevant document snippets.

Cite

Text

Li et al. "Extracting Relevant Snippets for Web Navigation." AAAI Conference on Artificial Intelligence, 2008.

Markdown

[Li et al. "Extracting Relevant Snippets for Web Navigation." AAAI Conference on Artificial Intelligence, 2008.](https://mlanthology.org/aaai/2008/li2008aaai-extracting/)

BibTeX

@inproceedings{li2008aaai-extracting,
  title     = {{Extracting Relevant Snippets for Web Navigation}},
  author    = {Li, Qing and Candan, K. Selçuk and Yan, Qi},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2008},
  pages     = {1195--1200},
  url       = {https://mlanthology.org/aaai/2008/li2008aaai-extracting/}
}