Hierarchically Supervised Latent Dirichlet Allocation
Abstract
We introduce hierarchically supervised latent Dirichlet allocation (HSLDA), a model for hierarchically and multiply labeled bag-of-word data. Examples of such data include web pages and their placement in directories, product descriptions and associated categories from product hierarchies, and free-text clinical records and their assigned diagnosis codes. Out-of-sample label prediction is the primary goal of this work, but improved lower-dimensional representations of the bag-of-word data are also of interest. We demonstrate HSLDA on large-scale data from clinical document labeling and retail product categorization tasks. We show that leveraging the structure from hierarchical labels improves out-of-sample label prediction substantially when compared to models that do not.
Cite
Text
Perotte et al. "Hierarchically Supervised Latent Dirichlet Allocation." Neural Information Processing Systems, 2011.
Markdown
[Perotte et al. "Hierarchically Supervised Latent Dirichlet Allocation." Neural Information Processing Systems, 2011.](https://mlanthology.org/neurips/2011/perotte2011neurips-hierarchically/)
BibTeX
@inproceedings{perotte2011neurips-hierarchically,
title = {{Hierarchically Supervised Latent Dirichlet Allocation}},
author = {Perotte, Adler J. and Wood, Frank and Elhadad, Noemie and Bartlett, Nicholas},
booktitle = {Neural Information Processing Systems},
year = {2011},
pages = {2609-2617},
url = {https://mlanthology.org/neurips/2011/perotte2011neurips-hierarchically/}
}