Contamination Budget: Trade-Offs Between Breadth, Depth and Difficulty

Abstract

Contamination in large language models (LLMs), and in machine learning more broadly, refers to the presence of identical or very similar examples in both the training and test sets. It usually leads to better test performance. Here we explore the case where contamination is introduced intentionally, for purposes that can be malicious (e.g., obtaining better scores in evaluations) or benevolent (e.g., fixing some mistakes). These interventions, usually carried out by fine-tuning the model to memorise examples, are constrained by a budget: the size of the fine-tuning dataset. Several trade-offs appear between the breadth of the intervention (how many examples are to be memorised), its depth (how many times each example is repeated) and the difficulty of the examples. By studying several LLMs and datasets, we observe some monotonic behaviour (more difficult items require more depth to be 'fixed') but also some non-monotonic phenomena (very high depth levels have negative effects on non-contaminated examples). This suggests that trade-offs should be sought not only in terms of the budget but also according to the specifics of the model, the task and the difficulty of the items at hand.
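
To make the budget constraint concrete, here is a minimal sketch of how a fixed fine-tuning budget could be split between breadth and depth. It assumes, as a simplification not stated in the abstract, that every selected example is repeated the same number of times, so that budget = breadth × depth. The function names `allocations` and `build_finetuning_set` are hypothetical and are not taken from the paper.

```python
def allocations(budget):
    """List every (breadth, depth) pair with breadth * depth == budget.

    Assumption: uniform depth, i.e. each chosen example is repeated
    the same number of times in the fine-tuning set.
    """
    return [(b, budget // b) for b in range(1, budget + 1) if budget % b == 0]


def build_finetuning_set(examples, breadth, depth):
    """Take the first `breadth` examples and repeat each one `depth` times."""
    chosen = examples[:breadth]
    return [ex for ex in chosen for _ in range(depth)]


if __name__ == "__main__":
    # With a budget of 12 instances, breadth and depth trade off directly.
    for breadth, depth in allocations(12):
        print(f"breadth={breadth:2d}  depth={depth:2d}")
```

Under this simplification, doubling the depth halves the number of distinct examples that fit in the same budget. Per the abstract, the best split is expected to depend on the model, the task and the difficulty of the items.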

Cite

Text

Mehrbakhsh et al. "Contamination Budget: Trade-Offs Between Breadth, Depth and Difficulty." International Joint Conference on Artificial Intelligence, 2025. doi:10.24963/IJCAI.2025/911

Markdown

[Mehrbakhsh et al. "Contamination Budget: Trade-Offs Between Breadth, Depth and Difficulty." International Joint Conference on Artificial Intelligence, 2025.](https://mlanthology.org/ijcai/2025/mehrbakhsh2025ijcai-contamination/) doi:10.24963/IJCAI.2025/911

BibTeX

@inproceedings{mehrbakhsh2025ijcai-contamination,
  title     = {{Contamination Budget: Trade-Offs Between Breadth, Depth and Difficulty}},
  author    = {Mehrbakhsh, Behzad and Martínez-Plumed, Fernando and Hernández-Orallo, José},
  booktitle = {International Joint Conference on Artificial Intelligence},
  year      = {2025},
  pages     = {8195--8203},
  doi       = {10.24963/IJCAI.2025/911},
  url       = {https://mlanthology.org/ijcai/2025/mehrbakhsh2025ijcai-contamination/}
}