Optimizing Temperature for Language Models with Multi-Sample Inference
Abstract
Multi-sample aggregation strategies, such as majority voting and best-of-N sampling, are widely used in contemporary large language models (LLMs) to enhance predictive accuracy across various tasks. A key challenge in this process is temperature selection, which significantly impacts model performance. Existing approaches either rely on a fixed default temperature or require labeled validation data for tuning, and such data are often scarce and difficult to obtain. This paper addresses the challenge of automatically identifying the (near-)optimal temperature for different LLMs under multi-sample aggregation strategies, without relying on task-specific validation data. We provide a comprehensive analysis of temperature's role in performance optimization, considering variations in model architectures, datasets, task types, model sizes, and predictive accuracy. Furthermore, we propose a novel entropy-based metric for automated temperature optimization, which consistently outperforms fixed-temperature baselines. Additionally, we incorporate a stochastic process model to enhance interpretability, offering deeper insights into the relationship between temperature and model performance.
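To make the setting concrete, the sketch below shows how one might sweep candidate temperatures and score each by an entropy statistic computed over multi-sample outputs, with no labels involved. The helper names (sample_fn, entropy_profile, majority_vote), the candidate temperature grid, and the sample count are illustrative assumptions; the paper's actual entropy-based metric for mapping such statistics to a chosen temperature is not reproduced here.

import math
from collections import Counter
from typing import Callable, Dict, Sequence

def majority_vote(answers: Sequence[str]) -> str:
    """Aggregate N sampled answers by taking the most frequent one."""
    return Counter(answers).most_common(1)[0][0]

def answer_entropy(answers: Sequence[str]) -> float:
    """Shannon entropy (in nats) of the empirical distribution over sampled answers."""
    counts = Counter(answers)
    n = len(answers)
    return -sum((c / n) * math.log(c / n) for c in counts.values())

def entropy_profile(
    sample_fn: Callable[[str, float, int], Sequence[str]],
    prompts: Sequence[str],
    temperatures: Sequence[float] = (0.2, 0.4, 0.6, 0.8, 1.0, 1.2),
    n_samples: int = 16,
) -> Dict[float, float]:
    """Mean answer entropy per candidate temperature; no validation labels are used.

    sample_fn(prompt, temperature, n) is a stand-in for any sampling backend
    that returns n extracted final answers for the given prompt.
    """
    return {
        t: sum(answer_entropy(sample_fn(p, t, n_samples)) for p in prompts) / len(prompts)
        for t in temperatures
    }

A downstream selection rule would then map this per-temperature entropy profile to a single operating temperature; the form of that rule is the paper's contribution and is not specified in the abstract.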
Cite
Du et al. "Optimizing Temperature for Language Models with Multi-Sample Inference." Proceedings of the 42nd International Conference on Machine Learning, 2025. https://mlanthology.org/icml/2025/du2025icml-optimizing/

BibTeX
@inproceedings{du2025icml-optimizing,
title = {{Optimizing Temperature for Language Models with Multi-Sample Inference}},
author = {Du, Weihua and Yang, Yiming and Welleck, Sean},
booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
year = {2025},
pages = {14648-14668},
volume = {267},
url = {https://mlanthology.org/icml/2025/du2025icml-optimizing/}
}