Towards Sampling Data Structures for Tensor Products in Turnstile Streams

Abstract

This paper addresses the computational cost of large-scale attention-based models by introducing sampling methods for the streaming setting. Inspired by the classical definition of the $\ell_2$ sampler and by recent progress on attention in Large Language Models (LLMs), we define the attention sampler. An attention sampler efficiently selects the important coordinates in an attention computation, avoiding the quadratic cost of computing the entire attention matrix. We analyze the attention sampler theoretically, in terms of space and update time. Our framework is also scalable and applies across a range of model architectures and domains.
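
As background for the classical $\ell_2$ sampler that the abstract refers to: given a vector $x$ defined by a turnstile stream of updates $(i, \Delta)$, where $\Delta$ may be positive or negative, an ideal $\ell_2$ sampler returns index $i$ with probability $x_i^2 / \|x\|_2^2$ (streaming constructions typically allow a small error in this probability). The sketch below only illustrates that target distribution. It stores $x$ explicitly, so it is not a small-space streaming algorithm and is not the paper's attention sampler; the class name `ExactL2Sampler` and its methods are hypothetical names chosen for this illustration.

```python
import random
from collections import defaultdict


class ExactL2Sampler:
    """Reference l2 sampler over a turnstile stream.

    Assumption: x is stored explicitly (linear space), so this only shows
    the target distribution, not a sublinear-space data structure.
    """

    def __init__(self, seed=None):
        self.x = defaultdict(float)      # current value of each coordinate
        self.rng = random.Random(seed)   # private RNG for reproducibility

    def update(self, i, delta):
        """Apply a turnstile update x_i += delta (delta may be negative)."""
        self.x[i] += delta

    def probabilities(self):
        """Return {i: x_i^2 / ||x||_2^2} over the nonzero coordinates."""
        total = sum(v * v for v in self.x.values())
        if total == 0:
            return {}
        return {i: v * v / total for i, v in self.x.items() if v != 0}

    def sample(self):
        """Draw one index from the l2 distribution, or None if x is zero."""
        probs = self.probabilities()
        if not probs:
            return None
        indices = list(probs)
        weights = [probs[i] for i in indices]
        return self.rng.choices(indices, weights=weights, k=1)[0]
```

For example, after the updates `(0, 3.0)`, `(1, 4.0)`, `(2, 5.0)`, `(2, -5.0)`, coordinate 2 has cancelled to zero, and `probabilities()` assigns 9/25 to index 0 and 16/25 to index 1.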

Cite

Text

Song et al. "Towards Sampling Data Structures for Tensor Products in Turnstile Streams." International Conference on Learning Representations, 2026.

Markdown

[Song et al. "Towards Sampling Data Structures for Tensor Products in Turnstile Streams." International Conference on Learning Representations, 2026.](https://mlanthology.org/iclr/2026/song2026iclr-sampling/)

BibTeX

@inproceedings{song2026iclr-sampling,
  title     = {{Towards Sampling Data Structures for Tensor Products in Turnstile Streams}},
  author    = {Song, Zhao and Xie, Shenghao and Zhou, Samson},
  booktitle = {International Conference on Learning Representations},
  year      = {2026},
  url       = {https://mlanthology.org/iclr/2026/song2026iclr-sampling/}
}