List-Level Distribution Coupling with Applications to Speculative Decoding and Lossy Compression

Abstract

We study a relaxation of the problem of coupling probability distributions — a list of samples is generated from one distribution and an *accept* is declared if any one of these samples is identical to the sample generated from the other distribution. We propose a novel method for generating samples, which extends the Gumbel-max sampling suggested in Daliri et al. (2025) for coupling probability distributions. We also establish a corresponding lower bound on the acceptance probability, which we call the *list matching lemma*. We next discuss two applications of our setup. First, we develop a new mechanism for multi-draft speculative sampling that is simple to implement and achieves performance competitive with baselines such as SpecTr and SpecInfer across a range of language tasks. Our method also guarantees a certain degree of *drafter invariance* with respect to the output tokens which is not supported by existing schemes. We also provide a theoretical lower bound on the token level acceptance probability. As our second application, we consider distributed lossy compression with side information in a setting where a source sample is compressed and available to multiple decoders, each with independent side information. We propose a compression technique that is based on our generalization of Gumbel-max sampling and show that it provides significant gains in experiments involving synthetic Gaussian sources and the MNIST image dataset.

Cite

Text

Rowan et al. "List-Level Distribution Coupling with Applications to Speculative Decoding and Lossy Compression." Advances in Neural Information Processing Systems, 2025.

Markdown

[Rowan et al. "List-Level Distribution Coupling with Applications to Speculative Decoding and Lossy Compression." Advances in Neural Information Processing Systems, 2025.](https://mlanthology.org/neurips/2025/rowan2025neurips-listlevel/)

BibTeX

@inproceedings{rowan2025neurips-listlevel,
  title     = {{List-Level Distribution Coupling with Applications to Speculative Decoding and Lossy Compression}},
  author    = {Rowan, Joseph and Phan, Buu and Khisti, Ashish J},
  booktitle = {Advances in Neural Information Processing Systems},
  year      = {2025},
  url       = {https://mlanthology.org/neurips/2025/rowan2025neurips-listlevel/}
}