Importance Sampling for Nonlinear Models

Abstract

While norm-based and leverage-score-based methods have been extensively studied for identifying "important" data points in linear models, analogous tools for nonlinear models remain significantly underdeveloped. By introducing the concept of the adjoint operator of a nonlinear map, we address this gap and generalize norm-based and leverage-score-based importance sampling to nonlinear settings. We demonstrate that sampling based on these generalized notions of norm and leverage scores provides approximation guarantees for the underlying nonlinear mapping, similar to linear subspace embeddings. As direct applications, these nonlinear scores not only reduce the computational complexity of training nonlinear models by enabling efficient sampling over large datasets but also offer a novel mechanism for model explainability and outlier detection. Our contributions are supported by both theoretical analyses and experimental results across a variety of supervised learning scenarios.
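To ground the classical notion the paper generalizes, here is a minimal sketch of leverage-score importance sampling in the *linear* setting (standard background, not the paper's nonlinear scores): row i's leverage score is the squared norm of the i-th row of U from the thin SVD A = USVᵀ, and sampling rows proportionally to these scores with rescaling yields a sketched least-squares problem that approximates the full one. All function names below are illustrative.

```python
import numpy as np

def leverage_scores(A):
    """Leverage score of row i = squared norm of row i of U (thin SVD).
    The scores sum to rank(A)."""
    U, _, _ = np.linalg.svd(A, full_matrices=False)
    return np.sum(U**2, axis=1)

def sample_rows(A, b, m, rng):
    """Sample m rows of (A, b) with probability proportional to leverage
    scores, rescaling each sampled row by 1/sqrt(m * p_i) so the sketched
    normal equations are unbiased estimates of the full ones."""
    p = leverage_scores(A)
    p = p / p.sum()
    idx = rng.choice(A.shape[0], size=m, replace=True, p=p)
    scale = 1.0 / np.sqrt(m * p[idx])
    return A[idx] * scale[:, None], b[idx] * scale

# Sketched vs. full least squares on a tall random problem.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 10))
b = A @ rng.standard_normal(10) + 0.01 * rng.standard_normal(2000)
SA, Sb = sample_rows(A, b, m=200, rng=rng)
x_full, *_ = np.linalg.lstsq(A, b, rcond=None)
x_sketch, *_ = np.linalg.lstsq(SA, Sb, rcond=None)
```

The paper's contribution is to define analogous scores for a nonlinear map via its adjoint operator, so that the same sample-and-rescale recipe carries over with embedding-style guarantees.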

Cite

Text

Rajmohan and Roosta. "Importance Sampling for Nonlinear Models." Proceedings of the 42nd International Conference on Machine Learning, 2025.

Markdown

[Rajmohan and Roosta. "Importance Sampling for Nonlinear Models." Proceedings of the 42nd International Conference on Machine Learning, 2025.](https://mlanthology.org/icml/2025/rajmohan2025icml-importance/)

BibTeX

@inproceedings{rajmohan2025icml-importance,
  title     = {{Importance Sampling for Nonlinear Models}},
  author    = {Rajmohan, Prakash Palanivelu and Roosta, Fred},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  year      = {2025},
  pages     = {51039--51059},
  volume    = {267},
  url       = {https://mlanthology.org/icml/2025/rajmohan2025icml-importance/}
}