The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank

Abstract

The PageRank algorithm, used in the Google search engine, greatly improves the results of Web search by taking into account the link structure of the Web. PageRank assigns to a page a score proportional to the number of times a random surfer would visit that page, if it surfed indefinitely from page to page, following all outlinks from a page with equal probability. We propose to improve PageRank by using a more intelligent surfer, one that is guided by a probabilistic model of the relevance of a page to a query. Efficient execution of our algorithm at query time is made possible by precomputing at crawl time (and thus once for all queries) the necessary terms. Experiments on two large subsets of the Web indicate that our algorithm significantly outperforms PageRank in the (human-rated) quality of the pages returned, while remaining efficient enough to be used in today’s large search engines.
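
The contrast between the random surfer and a relevance-guided surfer can be illustrated with a minimal sketch. The abstract does not give the update rule, so everything specific here is an assumption made for illustration: the function name `surfer_rank`, the `relevance` dictionary of per-page query scores, the damping value 0.85, and the choice to weight both outlinks and random jumps in proportion to relevance. Every linked page is assumed to appear as a key in `links`.

```python
def surfer_rank(links, relevance=None, damping=0.85, iters=100):
    """Power iteration for a random or relevance-guided surfer (illustrative).

    links:     dict mapping each page to the list of pages it links to.
    relevance: hypothetical dict of non-negative query-relevance scores;
               None reproduces the uniform random surfer of plain PageRank.
    """
    pages = list(links)
    rel = {p: 1.0 if relevance is None else relevance.get(p, 0.0) for p in pages}
    total = sum(rel.values())
    if total == 0:
        raise ValueError("at least one page needs non-zero relevance")

    # Distribution used when the surfer jumps instead of following a link.
    jump = {p: rel[p] / total for p in pages}
    rank = dict(jump)

    for _ in range(iters):
        new = {p: (1 - damping) * jump[p] for p in pages}
        for src in pages:
            out = links[src]
            weight = sum(rel[dst] for dst in out)
            if weight == 0:
                # No outlinks (or none relevant): fall back to a jump.
                for p in pages:
                    new[p] += damping * rank[src] * jump[p]
            else:
                # Follow each outlink in proportion to its relevance.
                for dst in out:
                    new[dst] += damping * rank[src] * rel[dst] / weight
        rank = new
    return rank


links = {"a": ["b", "c"], "b": ["c"], "c": ["a"]}
uniform = surfer_rank(links)                               # random surfer
guided = surfer_rank(links, {"a": 1.0, "b": 3.0, "c": 1.0})  # query favors "b"
```

In this toy graph the guided surfer visits page `b` more often than the uniform surfer does, because both the jump distribution and the choice among the outlinks of `a` favor it. Under these assumptions the scores depend only on the graph and the per-page relevance values, which is in the spirit of precomputing the necessary terms at crawl time.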

Cite

Text

Richardson and Domingos. "The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank." Neural Information Processing Systems, 2001.

Markdown

[Richardson and Domingos. "The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank." Neural Information Processing Systems, 2001.](https://mlanthology.org/neurips/2001/richardson2001neurips-intelligent/)

BibTeX

@inproceedings{richardson2001neurips-intelligent,
  title     = {{The Intelligent Surfer: Probabilistic Combination of Link and Content Information in PageRank}},
  author    = {Richardson, Matthew and Domingos, Pedro},
  booktitle = {Neural Information Processing Systems},
  year      = {2001},
  pages     = {1441-1448},
  url       = {https://mlanthology.org/neurips/2001/richardson2001neurips-intelligent/}
}