A Probabilistic Model for Text Kernels

Lehmann, Alain D.; Shawe-Taylor, John

doi:10.1145/1143844.1143912

ICML 2006 pp. 537-544

Abstract

This paper explores several kernels in the context of text classification. A novel view of how documents might have been created is introduced, and kernels are derived from this framework. The relations among these kernels, and between them and the Gaussian kernel, are discussed. Moreover, the popular tf-idf weighting scheme is derived as a natural consequence. Finally, the kernels are evaluated on the Reuters Corpus Volume I newswire database to assess their quality in a topic classification application.

Cite

Text

Lehmann and Shawe-Taylor. "A Probabilistic Model for Text Kernels." International Conference on Machine Learning, 2006. doi:10.1145/1143844.1143912

Markdown

[Lehmann and Shawe-Taylor. "A Probabilistic Model for Text Kernels." International Conference on Machine Learning, 2006.](https://mlanthology.org/icml/2006/lehmann2006icml-probabilistic/) doi:10.1145/1143844.1143912

BibTeX

@inproceedings{lehmann2006icml-probabilistic,
  title     = {{A Probabilistic Model for Text Kernels}},
  author    = {Lehmann, Alain D. and Shawe-Taylor, John},
  booktitle = {International Conference on Machine Learning},
  year      = {2006},
  pages     = {537--544},
  doi       = {10.1145/1143844.1143912},
  url       = {https://mlanthology.org/icml/2006/lehmann2006icml-probabilistic/}
}