NewsWeeder: Learning to Filter Netnews

Abstract

A significant problem in many information filtering systems is their dependence on the user to create and maintain a user profile describing the user's interests. NewsWeeder is a netnews-filtering system that addresses this problem by letting the user rate his or her interest level (1-5) for each article read, and then learning a user profile from these ratings. This paper describes how NewsWeeder accomplishes this task and examines the alternative learning methods used. The results show that a learning algorithm based on the Minimum Description Length (MDL) principle raised the percentage of interesting articles shown to users from 14% to 52% on average. Further, this method significantly outperformed (by 21%) one of the most successful techniques in Information Retrieval (IR), term-frequency/inverse-document-frequency (tf-idf) weighting.
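
For readers unfamiliar with the tf-idf baseline named in the abstract, the following is a minimal sketch of the common textbook formulation (weight = term count times log(N / document frequency), compared by cosine similarity). It is an illustration only: the abstract does not specify the variant used in the paper, and the function names `tfidf_vectors` and `cosine`, the whitespace tokenizer, and the sample documents are assumptions made for this example.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, with weight = tf * log(N / df).

    Assumption: naive lowercase whitespace tokenization, for illustration only.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: the number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)  # raw term counts within this document
        vectors.append({t: tf[t] * math.log(n_docs / df[t]) for t in tf})
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


if __name__ == "__main__":
    # Hypothetical toy articles, not data from the paper.
    docs = ["space shuttle launch", "space station crew", "hockey game score"]
    vecs = tfidf_vectors(docs)
    print(cosine(vecs[0], vecs[1]))  # shares the term "space"
    print(cosine(vecs[0], vecs[2]))  # no shared terms
```

In this formulation a term that appears in every document receives weight zero, so only terms that discriminate between articles contribute to the similarity score.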

Cite

Text

Lang. "NewsWeeder: Learning to Filter Netnews." International Conference on Machine Learning, 1995. doi:10.1016/B978-1-55860-377-6.50048-7

Markdown

[Lang. "NewsWeeder: Learning to Filter Netnews." International Conference on Machine Learning, 1995.](https://mlanthology.org/icml/1995/lang1995icml-newsweeder/) doi:10.1016/B978-1-55860-377-6.50048-7

BibTeX

@inproceedings{lang1995icml-newsweeder,
  title     = {{NewsWeeder: Learning to Filter Netnews}},
  author    = {Lang, Ken},
  booktitle = {International Conference on Machine Learning},
  year      = {1995},
  pages     = {331--339},
  doi       = {10.1016/B978-1-55860-377-6.50048-7},
  url       = {https://mlanthology.org/icml/1995/lang1995icml-newsweeder/}
}