Fast Training of Large Kernel Models with Delayed Projections
Abstract
Classical kernel machines have historically faced significant challenges in scaling to large datasets and model sizes, a key ingredient that has driven the success of neural networks. In this paper, we present a new methodology for building kernel machines that can scale efficiently with both data size and model size. Our algorithm introduces delayed projections to Preconditioned Stochastic Gradient Descent (PSGD), allowing the training of much larger models than was previously feasible. We validate our algorithm, EigenPro4, across multiple datasets, demonstrating drastic training speedups without compromising performance. Our implementation is publicly available at: https://github.com/EigenPro/EigenPro.
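The core idea named in the abstract, decoupling stochastic gradient updates from the projection onto the model's centers, can be illustrated with a small sketch. The following Python snippet is a hypothetical, minimal illustration and not the authors' EigenPro4 implementation: it accumulates plain SGD updates for a Gaussian-kernel model on temporary centers and only periodically projects them back onto a fixed set of model centers. The EigenPro-style preconditioner is omitted for brevity, and all names and hyperparameters (`gaussian_kernel`, `project_every`, etc.) are illustrative.

```python
# Illustrative sketch of delayed projections for a kernel model (not the
# authors' implementation). SGD updates live on recently seen batch points
# and are folded back onto the fixed centers Z only every few steps.
import numpy as np

def gaussian_kernel(X, Y, bandwidth=1.0):
    # Pairwise Gaussian kernel: K[i, j] = exp(-||x_i - y_j||^2 / (2 s^2)).
    sq = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-sq / (2 * bandwidth ** 2))

rng = np.random.default_rng(0)
n, p, d = 2000, 200, 5                      # data size, model size, input dim
X = rng.normal(size=(n, d))
y = np.sin(X.sum(axis=1))                   # toy regression target
Z = X[rng.choice(n, p, replace=False)]      # fixed model centers

alpha = np.zeros(p)                         # model weights on the centers Z
tmp_X, tmp_w = [], []                       # unprojected updates (buffer)
K_ZZ = gaussian_kernel(Z, Z) + 1e-6 * np.eye(p)  # Gram matrix for projection

lr, batch, project_every = 0.5, 50, 10
for step in range(200):
    idx = rng.choice(n, batch, replace=False)
    Xb, yb = X[idx], y[idx]
    # Prediction on the batch: contribution of Z plus the unprojected buffer.
    pred = gaussian_kernel(Xb, Z) @ alpha
    if tmp_X:
        pred += gaussian_kernel(Xb, np.concatenate(tmp_X)) @ np.concatenate(tmp_w)
    resid = pred - yb
    # The SGD step is supported on the batch points, not on Z; stash it
    # instead of projecting immediately.
    tmp_X.append(Xb)
    tmp_w.append(-lr / batch * resid)
    # Delayed projection: every `project_every` steps, map the buffered
    # function onto span{K(., z_i)} via the normal equations
    # K_ZZ beta = K_{Z,buf} w_buf, then clear the buffer.
    if (step + 1) % project_every == 0:
        Xbuf, wbuf = np.concatenate(tmp_X), np.concatenate(tmp_w)
        alpha += np.linalg.solve(K_ZZ, gaussian_kernel(Z, Xbuf) @ wbuf)
        tmp_X, tmp_w = [], []

print("train MSE:", np.mean((gaussian_kernel(X, Z) @ alpha - y) ** 2))
```

Deferring the solve against the center Gram matrix amortizes the projection cost over many SGD steps, which is, at a high level, where a speedup of this kind comes from; the paper should be consulted for the actual algorithm and its preconditioning.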
Cite

Text:
Abedsoltan et al. "Fast Training of Large Kernel Models with Delayed Projections." Advances in Neural Information Processing Systems, 2025.

Markdown:
[Abedsoltan et al. "Fast Training of Large Kernel Models with Delayed Projections." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/abedsoltan2025neurips-fast/)

BibTeX:
@inproceedings{abedsoltan2025neurips-fast,
title = {{Fast Training of Large Kernel Models with Delayed Projections}},
author = {Abedsoltan, Amirhesam and Ma, Siyuan and Pandit, Parthe and Belkin, Mikhail},
booktitle = {Advances in Neural Information Processing Systems},
year = {2025},
url = {https://mlanthology.org/neurips/2025/abedsoltan2025neurips-fast/}
}