ChatGPT-Generated Code Assignment Detection Using Perplexity of Large Language Models (Student Abstract)

Abstract

In the era of large language models like ChatGPT, maintaining academic integrity in programming education has become challenging due to potential misuse. There is a pressing need for reliable detectors to identify ChatGPT-generated code. While previous studies have tackled model-generated text detection, identifying such code remains largely unexplored. In this paper, we introduce a novel method to discern ChatGPT-generated code. We employ targeted masking perturbation, emphasizing code sections with high perplexity. A fine-tuned CodeBERT is then used to fill these masked sections, generating subtly perturbed samples. Our scoring system combines overall perplexity, variation in code-line perplexity, and burstiness. In this scoring scheme, a higher rank for the original code suggests it is more likely to be ChatGPT-generated. The underlying principle is that model-generated code typically exhibits consistent, low perplexity and reduced burstiness, and its ranking remains relatively stable even after subtle modifications. In contrast, perturbing human-written code is more likely to produce samples that the model prefers. Our approach significantly outperforms current detectors, especially against OpenAI's text-davinci-003 model, raising the average AUC from 0.56 (GPTZero baseline) to 0.87.
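The rank-based detection idea described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' implementation: the score weights, the burstiness proxy, and the rank convention are assumptions, and in practice the line perplexities would come from a code language model rather than the toy values shown here.

```python
import statistics

def combined_score(line_ppls, w_mean=1.0, w_var=1.0, w_burst=1.0):
    """Combine overall perplexity, line-level perplexity variation, and
    burstiness into one score. Equal weights are a hypothetical choice."""
    mean_ppl = statistics.mean(line_ppls)             # overall perplexity
    variation = statistics.pstdev(line_ppls)          # spread across code lines
    burstiness = max(line_ppls) - min(line_ppls)      # simple burstiness proxy
    return w_mean * mean_ppl + w_var * variation + w_burst * burstiness

def original_rank(original_ppls, perturbed_ppls_list):
    """Rank of the original sample (1 = lowest combined score) among itself
    and its masked-and-refilled perturbed variants."""
    orig = combined_score(original_ppls)
    scores = [orig] + [combined_score(p) for p in perturbed_ppls_list]
    return sorted(scores).index(orig) + 1

# Model-like code: flat, low line perplexities stay lowest-ranked after perturbation.
flat = [2.0, 2.1, 2.0]
print(original_rank(flat, [[2.5, 2.6, 2.4], [2.3, 2.2, 2.5]]))

# Human-like code: high, bursty perplexities; perturbed variants score lower,
# pushing the original toward the last rank.
bursty = [3.0, 8.0, 2.0]
print(original_rank(bursty, [[2.0, 2.5, 2.2], [2.1, 2.3, 2.0]]))
```

With a rank convention where 1 means "the model prefers the original over its perturbations," a stable low rank points toward model-generated code, while a high rank points toward human-written code.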

Cite

Text

Xu et al. "ChatGPT-Generated Code Assignment Detection Using Perplexity of Large Language Models (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024. doi:10.1609/AAAI.V38I21.30527

Markdown

[Xu et al. "ChatGPT-Generated Code Assignment Detection Using Perplexity of Large Language Models (Student Abstract)." AAAI Conference on Artificial Intelligence, 2024.](https://mlanthology.org/aaai/2024/xu2024aaai-chatgpt/) doi:10.1609/AAAI.V38I21.30527

BibTeX

@inproceedings{xu2024aaai-chatgpt,
  title     = {{ChatGPT-Generated Code Assignment Detection Using Perplexity of Large Language Models (Student Abstract)}},
  author    = {Xu, Zhenyu and Xu, Ruoyu and Sheng, Victor S.},
  booktitle = {AAAI Conference on Artificial Intelligence},
  year      = {2024},
  pages     = {23688--23689},
  doi       = {10.1609/AAAI.V38I21.30527},
  url       = {https://mlanthology.org/aaai/2024/xu2024aaai-chatgpt/}
}