Dimension Reduction for High-Dimensional Small Counts with KL Divergence

Abstract

Dimension reduction for high-dimensional count data with a large proportion of zeros is an important task in various applications. Since many dimension reduction methods rely on a proximity measure between observations, we develop a dissimilarity measure based on the Kullback-Leibler divergence that is well-suited to small counts. We compare the proposed measure with other widely used dissimilarity measures and show that it has superior discriminative ability on high-dimensional count data with an excess of zeros. Extensive empirical results on both simulated and publicly available real-world datasets containing many zeros demonstrate that the proposed dissimilarity measure can improve a wide range of dimension reduction methods.
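
For intuition, the minimal Python sketch below shows one way a symmetrized KL-divergence dissimilarity between count vectors might be computed; the function name, the pseudocount alpha, and the simple normalization are illustrative assumptions, not the paper's exact estimator.

import numpy as np

def kl_dissimilarity(X, alpha=0.5):
    """Pairwise symmetrized KL dissimilarity between rows of a count matrix.

    Illustrative sketch only: each count vector is smoothed with a
    pseudocount `alpha` (so zero counts stay finite) and normalized to a
    probability vector; the paper's small-count estimator may differ.
    """
    X = np.asarray(X, dtype=float)
    P = X + alpha                       # pseudocount smoothing handles zeros
    P /= P.sum(axis=1, keepdims=True)   # rows become probability vectors
    logP = np.log(P)
    # KL(p_i || p_j) = sum_k p_ik (log p_ik - log p_jk), computed for all pairs
    kl = (P * logP).sum(axis=1)[:, None] - P @ logP.T
    return kl + kl.T                    # symmetrize: KL(i||j) + KL(j||i)

The resulting matrix can be fed to any proximity-based dimension reduction method, e.g. scikit-learn's MDS with dissimilarity="precomputed".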

Cite

Text

Ling and Xue. "Dimension Reduction for High-Dimensional Small Counts with KL Divergence." Uncertainty in Artificial Intelligence, 2022.

Markdown

[Ling and Xue. "Dimension Reduction for High-Dimensional Small Counts with KL Divergence." Uncertainty in Artificial Intelligence, 2022.](https://mlanthology.org/uai/2022/ling2022uai-dimension/)

BibTeX

@inproceedings{ling2022uai-dimension,
  title     = {{Dimension Reduction for High-Dimensional Small Counts with KL Divergence}},
  author    = {Ling, Yurong and Xue, Jing-Hao},
  booktitle = {Uncertainty in Artificial Intelligence},
  year      = {2022},
  pages     = {1210--1220},
  volume    = {180},
  url       = {https://mlanthology.org/uai/2022/ling2022uai-dimension/}
}