Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text

Abstract

Detecting text generated by modern large language models is thought to be hard, as both LLMs and humans can exhibit a wide range of complex behaviors. However, we find that a score based on contrasting two closely related language models is highly accurate at separating human-generated and machine-generated text. Based on this mechanism, we propose a novel LLM detector that only requires simple calculations using a pair of pre-trained LLMs. The method, called Binoculars, achieves state-of-the-art accuracy without any training data. It is capable of spotting machine text from a range of modern LLMs without any model-specific modifications. We comprehensively evaluate Binoculars on a number of text sources and in varied situations. Over a wide range of document types, Binoculars detects over 90% of generated samples from ChatGPT (and other LLMs) at a false positive rate of 0.01%, despite not being trained on any ChatGPT data. Code available at https://github.com/ahans30/Binoculars.
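The abstract describes the score only as "contrasting two closely related language models" and does not give a formula. The sketch below is one plausible reading, under stated assumptions: the score is the log-perplexity of the text under one model divided by the cross-perplexity between the two models' next-token predictions. The function name `binoculars_style_score`, the assignment of roles to the two models, and the use of plain NumPy arrays in place of real LLM outputs are illustrative assumptions, not the authors' implementation (see the linked repository for that).

```python
import numpy as np


def log_softmax(logits):
    """Numerically stable log-softmax over the last (vocabulary) axis."""
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))


def binoculars_style_score(logits_a, logits_b, token_ids):
    """Contrast two models on the same text (hypothetical helper).

    logits_a, logits_b: arrays of shape (L, V) holding next-token logits
        from two language models that share a vocabulary; row i is the
        prediction for token_ids[i].
    token_ids: integer array of shape (L,) with the observed tokens.
    """
    logp_a = log_softmax(np.asarray(logits_a, dtype=float))
    logp_b = log_softmax(np.asarray(logits_b, dtype=float))
    token_ids = np.asarray(token_ids)

    # Log-perplexity: mean negative log-likelihood of the observed tokens
    # under model A.
    log_ppl = -logp_a[np.arange(len(token_ids)), token_ids].mean()

    # Cross-perplexity (assumed form): mean cross-entropy between model A's
    # predicted distribution and model B's log-probabilities.
    cross_ppl = -(np.exp(logp_a) * logp_b).sum(axis=-1).mean()

    return log_ppl / cross_ppl


# Toy inputs: both "models" emit the same peaked distribution over 4 tokens.
peaked = np.tile(np.array([5.0, 0.0, 0.0, 0.0]), (3, 1))
expected_text = binoculars_style_score(peaked, peaked, [0, 0, 0])
surprising_text = binoculars_style_score(peaked, peaked, [1, 1, 1])
uniform = binoculars_style_score(np.zeros((3, 4)), np.zeros((3, 4)), [0, 1, 2])
```

In this toy setup, text that follows the models' top predictions scores below 1 and text that surprises them scores above 1. A detector in this style would compare the score against a fixed threshold, chosen on held-out human text to meet a target false positive rate, and flag low-scoring documents as likely machine-generated.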

Cite

Text

Hans et al. "Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text." International Conference on Machine Learning, 2024.

Markdown

[Hans et al. "Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text." International Conference on Machine Learning, 2024.](https://mlanthology.org/icml/2024/hans2024icml-spotting/)

BibTeX

@inproceedings{hans2024icml-spotting,
  title     = {{Spotting LLMs with Binoculars: Zero-Shot Detection of Machine-Generated Text}},
  author    = {Hans, Abhimanyu and Schwarzschild, Avi and Cherepanova, Valeriia and Kazemi, Hamid and Saha, Aniruddha and Goldblum, Micah and Geiping, Jonas and Goldstein, Tom},
  booktitle = {International Conference on Machine Learning},
  year      = {2024},
  pages     = {17519--17537},
  volume    = {235},
  url       = {https://mlanthology.org/icml/2024/hans2024icml-spotting/}
}