Kernel Operations on the GPU, with Autodiff, Without Memory Overflows

Abstract

The KeOps library provides fast and memory-efficient GPU support for tensors whose entries are given by a mathematical formula, such as kernel and distance matrices. KeOps alleviates the main bottleneck of tensor-centric libraries for kernel and geometric applications: memory consumption. It also supports automatic differentiation and outperforms standard GPU baselines, including PyTorch CUDA tensors and the Halide and TVM libraries. KeOps combines optimized C++/CUDA schemes with binders for high-level languages: Python (NumPy and PyTorch), Matlab and GNU R. As a result, high-level “quadratic” codes can now scale up to large data sets with millions of samples processed in seconds. KeOps brings graphics-like performance to kernel methods and is freely available on standard repositories (PyPi, CRAN). To showcase its versatility, we provide tutorials in a wide range of settings online at www.kernel-operations.io.
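To illustrate the memory bottleneck the abstract refers to, here is a minimal NumPy sketch (not the KeOps API) of a Gaussian kernel matrix-vector product computed in tiles, so that the full M-by-N kernel matrix is never materialized in memory; the function name `kernel_matvec` and the tile size are illustrative choices, not part of the library:

```python
import numpy as np

def kernel_matvec(x, y, b, tile=1024):
    """Compute a_i = sum_j exp(-|x_i - y_j|^2) * b_j without storing the
    full (M, N) kernel matrix: only one (tile, N) block lives in memory
    at a time. This mimics the streaming reduction that symbolic-tensor
    libraries like KeOps perform on the GPU."""
    M = x.shape[0]
    a = np.zeros((M, b.shape[1]))
    for start in range(0, M, tile):
        xi = x[start:start + tile]                             # (tile, D) block of queries
        d2 = ((xi[:, None, :] - y[None, :, :]) ** 2).sum(-1)   # (tile, N) squared distances
        a[start:start + tile] = np.exp(-d2) @ b                # accumulate this block's reduction
    return a

# Toy usage: 2,000 queries against 1,500 points in dimension 3.
rng = np.random.default_rng(0)
x = rng.normal(size=(2000, 3))
y = rng.normal(size=(1500, 3))
b = rng.normal(size=(1500, 1))
a = kernel_matvec(x, y, b)
```

A dense implementation would allocate the full 2000-by-1500 matrix up front; the tiled loop keeps peak memory proportional to `tile * N` instead, which is the idea KeOps pushes to its limit with on-the-fly CUDA codegen.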

Cite

Text

Charlier et al. "Kernel Operations on the GPU, with Autodiff, Without Memory Overflows." Machine Learning Open Source Software, 2021.

Markdown

[Charlier et al. "Kernel Operations on the GPU, with Autodiff, Without Memory Overflows." Machine Learning Open Source Software, 2021.](https://mlanthology.org/mloss/2021/charlier2021jmlr-kernel/)

BibTeX

@article{charlier2021jmlr-kernel,
  title     = {{Kernel Operations on the GPU, with Autodiff, Without Memory Overflows}},
  author    = {Charlier, Benjamin and Feydy, Jean and Glaunès, Joan Alexis and Collin, François-David and Durif, Ghislain},
  journal   = {Machine Learning Open Source Software},
  year      = {2021},
  pages     = {1--6},
  volume    = {22},
  url       = {https://mlanthology.org/mloss/2021/charlier2021jmlr-kernel/}
}