Assessing Diversity Collapse in Reasoning

Abstract

We identify a striking phenomenon in large language models finetuned on reasoning tasks: as Pass@1 improves during supervised finetuning, Pass@k rapidly deteriorates and fails to recover with reinforcement learning or self-improvement. We formalize the relationship between expected Pass@k and Pass@1 over the test distribution and attribute the early drop in Pass@k to diversity collapse, where finetuning causes the probability mass to converge toward a single reasoning path and final answer for test questions. We theoretically prove that the standard finetuning pipeline of SFT followed by RL leads to diversity collapse in reasoning models. We then estimate the optimal Pass@k achievable by an oracle with access to the model's distribution over final answers marginalized over all rollouts, and reveal a significant gap relative to current token-level diverse decoding methods such as temperature scaling, top-k, nucleus, and min-p sampling. These results highlight the need for better decoding strategies for generating reasoning steps during self-improvement and inference. Finally, we propose a promising remedy based on model weight interpolation.
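
For concreteness, below is a minimal sketch (not the paper's own formalization or code) of the quantities the abstract discusses: the standard unbiased Pass@k estimator of Chen et al. (2021) computed from n samples with c correct, together with the idealized relationship E[Pass@k] = 1 - (1 - Pass@1)^k, which holds under the assumption that a question's samples succeed independently with probability Pass@1. Under diversity collapse, repeated samples concentrate on a single answer, so additional samples stop helping and Pass@k degrades toward Pass@1.

# Illustrative sketch only; assumptions noted above, not the authors' method.
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased Pass@k estimate from n samples with c correct (Chen et al., 2021)."""
    if n - c < k:
        return 1.0  # fewer than k incorrect samples: every size-k subset contains a success
    return 1.0 - comb(n - c, k) / comb(n, k)

def expected_pass_at_k(pass_at_1: float, k: int) -> float:
    """Idealized E[Pass@k] when samples succeed independently with prob. pass_at_1."""
    return 1.0 - (1.0 - pass_at_1) ** k

# A diverse model solving a question 25% of the time benefits from more samples ...
print(expected_pass_at_k(0.25, k=8))   # ~0.90
# ... while a collapsed model that always emits the same wrong answer gains nothing.
print(expected_pass_at_k(0.0, k=8))    # 0.0
print(pass_at_k(n=16, c=4, k=8))       # ~0.96, empirical estimate from 16 samples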

Cite

Text

Dang et al. "Assessing Diversity Collapse in Reasoning." ICLR 2025 Workshops: SSI-FM, 2025.

Markdown

[Dang et al. "Assessing Diversity Collapse in Reasoning." ICLR 2025 Workshops: SSI-FM, 2025.](https://mlanthology.org/iclrw/2025/dang2025iclrw-assessing/)

BibTeX

@inproceedings{dang2025iclrw-assessing,
  title     = {{Assessing Diversity Collapse in Reasoning}},
  author    = {Dang, Xingyu and Baek, Christina and Kolter, J Zico and Raghunathan, Aditi},
  booktitle = {ICLR 2025 Workshops: SSI-FM},
  year      = {2025},
  url       = {https://mlanthology.org/iclrw/2025/dang2025iclrw-assessing/}
}