Feature Hashing for Large Scale Multitask Learning

Abstract

Empirical evidence suggests that hashing is an effective strategy for dimensionality reduction and practical nonparametric estimation. In this paper we provide exponential tail bounds for feature hashing and show that the interaction between random subspaces is negligible with high probability. We demonstrate the feasibility of this approach with experimental results for a new use case --- multitask learning with hundreds of thousands of tasks.
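The hashing approach the abstract refers to can be sketched as the signed hashing trick: each feature is mapped to one of m buckets by a hash function, with a second hash supplying a ±1 sign so that collision noise cancels in expectation and inner products remain unbiased. The multitask variant below is a rough illustration of the paper's use case, salting the hash with a task identifier so each task occupies its own random subspace of the shared vector; the function names, the choice of MD5, and the parameter m are illustrative assumptions, not the paper's implementation.

```python
import hashlib

def hash_features(tokens, m=1024, salt=""):
    """Signed hashing trick: index = h(token) mod m, sign in {-1, +1}
    from a second hash byte, so colliding features cancel in
    expectation. MD5 is used only as a convenient deterministic hash."""
    vec = [0.0] * m
    for tok in tokens:
        digest = hashlib.md5((salt + tok).encode("utf-8")).digest()
        idx = int.from_bytes(digest[:4], "little") % m
        sign = 1.0 if digest[4] % 2 == 0 else -1.0
        vec[idx] += sign
    return vec

def multitask_hash(tokens, task_id, m=1024):
    """Illustrative multitask variant: combine a shared (global)
    hashed representation with a task-personalized one obtained by
    salting the hash with the task id, as in per-user spam filtering."""
    shared = hash_features(tokens, m)
    personal = hash_features(tokens, m, salt=task_id + "|")
    return [a + b for a, b in zip(shared, personal)]
```

Because the hash is deterministic, no feature dictionary needs to be stored, and the memory footprint is fixed at m regardless of the number of tasks.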

Cite

Text

Weinberger et al. "Feature Hashing for Large Scale Multitask Learning." International Conference on Machine Learning, 2009. doi:10.1145/1553374.1553516

Markdown

[Weinberger et al. "Feature Hashing for Large Scale Multitask Learning." International Conference on Machine Learning, 2009.](https://mlanthology.org/icml/2009/weinberger2009icml-feature/) doi:10.1145/1553374.1553516

BibTeX

@inproceedings{weinberger2009icml-feature,
  title     = {{Feature Hashing for Large Scale Multitask Learning}},
  author    = {Weinberger, Kilian Q. and Dasgupta, Anirban and Langford, John and Smola, Alexander J. and Attenberg, Josh},
  booktitle = {International Conference on Machine Learning},
  year      = {2009},
  pages     = {1113--1120},
  doi       = {10.1145/1553374.1553516},
  url       = {https://mlanthology.org/icml/2009/weinberger2009icml-feature/}
}