Diffusion-Based Methods for Estimating Curvature in Data

Abstract

High-throughput, high-dimensional data is now generated in massive quantities in many fields, including biology, medicine, chemistry, finance, and physics. Researchers have successfully used manifold learning to gain insight from such data, particularly biomedical and single-cell data. One such technique, data diffusion geometry, has been useful for understanding manifold-intrinsic distances, density, and the major non-linear axes or paths through the data. However, curvature remains a relatively unstudied feature of high-dimensional data. While curvature is well-defined and easy to compute in low dimensions, it poses computational and conceptual difficulties in high dimensions. Here, we present two techniques for estimating curvature from high-dimensional data, both starting from data diffusion probabilities. The first technique, diffusion curvature, uses the spread, or conversely the laziness, of a random walk to estimate curvature pointwise in data. The second technique, deep diffusion curvature, trains a neural network to estimate pointwise curvature. Since these techniques are scalable, we anticipate that they can be used to describe and compare datasets, as well as to find points in data that represent transitional entities.
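To make the notion of random-walk laziness concrete, the following is a minimal sketch and not the authors' implementation. It assumes a fixed-bandwidth Gaussian kernel, a row-normalized diffusion matrix, and a neighborhood made of each point plus its k nearest neighbors. The function name `diffusion_laziness` and the parameters `sigma`, `t`, and `k` are hypothetical choices made for illustration. The score for a point is the probability mass that a t-step walk started there still places on its own neighborhood; under these assumptions, larger values would suggest that the walk spreads less.

```python
import numpy as np

def diffusion_laziness(X, sigma=0.5, t=8, k=10):
    """Hypothetical helper: t-step random-walk mass that stays near each point."""
    # Pairwise squared Euclidean distances (clipped to avoid tiny negatives).
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    # Gaussian affinities, row-normalized into a Markov transition matrix.
    K = np.exp(-d2 / (2.0 * sigma**2))
    P = K / K.sum(axis=1, keepdims=True)
    # t-step diffusion probabilities.
    Pt = np.linalg.matrix_power(P, t)
    # Assumed neighborhood: each point together with its k nearest neighbors.
    nbrs = np.argsort(d2, axis=1)[:, : k + 1]
    # Laziness: probability mass remaining inside that neighborhood.
    return np.take_along_axis(Pt, nbrs, axis=1).sum(axis=1)

# Usage on synthetic points sampled from a unit sphere (illustrative data only).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
X /= np.linalg.norm(X, axis=1, keepdims=True)
laziness = diffusion_laziness(X)
```

The dense distance matrix and matrix power keep the sketch short but scale quadratically or worse with the number of points; a sparse nearest-neighbor graph would be the natural substitute for larger datasets.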

Cite

Text

Bhaskar et al. "Diffusion-Based Methods for Estimating Curvature in Data." ICLR 2022 Workshops: GTRL, 2022.

Markdown

[Bhaskar et al. "Diffusion-Based Methods for Estimating Curvature in Data." ICLR 2022 Workshops: GTRL, 2022.](https://mlanthology.org/iclrw/2022/bhaskar2022iclrw-diffusionbased/)

BibTeX

@inproceedings{bhaskar2022iclrw-diffusionbased,
  title     = {{Diffusion-Based Methods for Estimating Curvature in Data}},
  author    = {Bhaskar, Dhananjay and MacDonald, Kincaid and Thomas, Dawson and Zhao, Sarah and You, Kisung and Paige, Jennifer and Aizenbud, Yariv and Rieck, Bastian and Adelstein, Ian M and Krishnaswamy, Smita},
  booktitle = {ICLR 2022 Workshops: GTRL},
  year      = {2022},
  url       = {https://mlanthology.org/iclrw/2022/bhaskar2022iclrw-diffusionbased/}
}