A Simple Model of Inference Scaling Laws

Abstract

Neural scaling laws have garnered significant interest due to their ability to predict model performance as a function of increasing parameters, data, and compute. In this work, we propose a simple statistical ansatz based on memorization to study scaling laws in the context of inference, specifically how performance improves with multiple inference attempts. We explore coverage, or the pass@k metric, which measures the probability of success over repeated attempts, and motivate the observed functional form of the inference scaling behavior of coverage in large language models (LLMs) on reasoning tasks. We then define an "inference loss", which exhibits a power-law decay as the number of trials increases, and connect this result with prompting costs. We further test the universality of our construction by conducting experiments on a simple generative model, and find that our predictions agree with the empirical coverage curves in a controlled setting. Our simple framework lays the groundwork for incorporating inference scaling with other known scaling laws.
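
For background, below is a minimal sketch of the standard unbiased pass@k (coverage) estimator commonly used for this metric; the function name and the example numbers are illustrative assumptions, not taken from the paper, and the paper's own coverage computation may differ in detail.

```python
import math

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased estimator of coverage (pass@k): the probability that at least
    one of k samples, drawn without replacement from n attempts of which c are
    correct, succeeds.  pass@k = 1 - C(n - c, k) / C(n, k)."""
    if n - c < k:
        # Fewer than k incorrect attempts exist, so any k-sample must contain a success.
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)

# Illustrative (hypothetical) numbers: 100 attempts, 7 correct.
# Coverage rises toward 1 as the number of trials k grows.
for k in (1, 5, 25, 100):
    print(f"pass@{k} = {pass_at_k(100, 7, k):.4f}")
```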

Cite

Text

Levi. "A Simple Model of Inference Scaling Laws." ICLR 2025 Workshops: DeLTa, 2025.

Markdown

[Levi. "A Simple Model of Inference Scaling Laws." ICLR 2025 Workshops: DeLTa, 2025.](https://mlanthology.org/iclrw/2025/levi2025iclrw-simple/)

BibTeX

@inproceedings{levi2025iclrw-simple,
  title     = {{A Simple Model of Inference Scaling Laws}},
  author    = {Levi, Noam Itzhak},
  booktitle = {ICLR 2025 Workshops: DeLTa},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/levi2025iclrw-simple/}
}