A Simple Model of Inference Scaling Laws
Abstract
Neural scaling laws have garnered significant interest due to their ability to predict model performance as a function of increasing parameters, data, and compute. In this work, we propose a simple statistical ansatz based on memorization to study scaling laws in the context of inference, specifically how performance improves with multiple inference attempts. We explore the coverage, or pass@k metric, which measures the chance of success over repeated attempts, and provide a motivation for the observed functional form of the inference scaling behavior of coverage in large language models (LLMs) on reasoning tasks. We then define an "inference loss", which exhibits a power-law decay as the number of trials increases, and connect this result with prompting costs. We further test the universality of our construction by conducting experiments on a simple generative model, and find that our predictions agree with the empirical coverage curves in a controlled setting. Our simple framework lays the groundwork for incorporating inference scaling with other known scaling laws.
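The abstract describes coverage, or pass@k, as the chance of success over repeated inference attempts. As a minimal illustration only (the paper's memorization ansatz is not spelled out in the abstract), the sketch below shows the standard unbiased pass@k estimator of Chen et al. (2021) alongside the simple closed form 1 - (1 - p)^k that holds when attempts are i.i.d. with per-attempt success probability p; the function names and the numbers in the usage example are illustrative assumptions, not the paper's method.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator (Chen et al., 2021).

    n: total completions sampled for a problem
    c: number of correct completions among the n
    k: number of attempts allowed
    Returns the probability that at least one of k randomly
    chosen completions is correct.
    """
    if n - c < k:
        return 1.0  # every k-subset must contain a correct completion
    # 1 - C(n - c, k) / C(n, k), computed as a stable running product
    return 1.0 - float(np.prod(1.0 - k / np.arange(n - c + 1, n + 1)))

def coverage_iid(p: float, k: int) -> float:
    """Coverage under i.i.d. attempts with per-attempt success probability p:
    pass@k = 1 - (1 - p)**k. An illustrative baseline, not the paper's ansatz."""
    return 1.0 - (1.0 - p) ** k

# Usage with made-up numbers: an empirical estimate vs. the i.i.d. closed form.
print(pass_at_k(n=200, c=13, k=10))
print(coverage_iid(p=0.065, k=10))
```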
Cite
Text
Levi. "A Simple Model of Inference Scaling Laws." ICLR 2025 Workshops: DeLTa, 2025.

Markdown
[Levi. "A Simple Model of Inference Scaling Laws." ICLR 2025 Workshops: DeLTa, 2025.](https://mlanthology.org/iclrw/2025/levi2025iclrw-simple/)

BibTeX
@inproceedings{levi2025iclrw-simple,
title = {{A Simple Model of Inference Scaling Laws}},
author = {Levi, Noam Itzhak},
booktitle = {ICLR 2025 Workshops: DeLTa},
year = {2025},
url = {https://mlanthology.org/iclrw/2025/levi2025iclrw-simple/}
}