# Fast-DetectGPT

**This code is for the ICLR 2024 paper "Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature"**, where we borrow or extend some code from [DetectGPT](https://github.com/eric-mitchell/detect-gpt).

[Paper](https://arxiv.org/abs/2310.05130)
| [LocalDemo](#local-demo)
| [OnlineDemo](http://region-9.autodl.pro:21504/)
| [OpenReview](https://openreview.net/forum?id=Bpcgcr8E8Z)

## Brief Intro

| Method | 5-Model Generations ↑ | ChatGPT/GPT-4 Generations ↑ | Speedup ↑ |
|---|---|---|---|
| DetectGPT | 0.9554 | 0.7225 | 1x |
| Fast-DetectGPT | 0.9887 (relative↑ 74.7%) | 0.9338 (relative↑ 76.1%) | 340x |
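
The relative improvements in the table match the AUROC gain measured as a fraction of the headroom left up to a perfect score of 1.0. This README does not state that formula; it is inferred from the reported numbers, and the function name below is hypothetical. A minimal sketch of the arithmetic:

```python
def relative_improvement(baseline_auroc: float, improved_auroc: float) -> float:
    """Gain over the baseline as a fraction of the remaining headroom to AUROC = 1.0.

    Assumption: this formula is inferred from the table, not taken from the README.
    """
    headroom = 1.0 - baseline_auroc
    return (improved_auroc - baseline_auroc) / headroom


# AUROC values copied from the table (DetectGPT vs. Fast-DetectGPT).
white_box = relative_improvement(0.9554, 0.9887)  # 5-model generations
black_box = relative_improvement(0.7225, 0.9338)  # ChatGPT/GPT-4 generations

print(f"white-box: {white_box:.1%}")  # white-box: 74.7%
print(f"black-box: {black_box:.1%}")  # black-box: 76.1%
```

Read this way, an improvement from 0.9554 to 0.9887 closes about three quarters of the gap to a perfect detector, even though the absolute gain is only about 0.03.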

The table shows detection accuracy (measured in AUROC) and computational speedup for machine-generated text detection. The <b>white-box setting</b> (directly using the source model) is used for detecting generations produced by five source models (5-model), whereas the <b>black-box setting</b> (utilizing surrogate models) targets ChatGPT and GPT-4 generations. AUROC results are averaged across various datasets and source models. Speedup assessments were conducted on a Tesla A100 GPU.

## Environment
* Python 3.8
* PyTorch 1.10.0
* Set up the environment: `bash setup.sh`

(Note: our experiments were run on a single Tesla A100 GPU with 80 GB of memory.)

## Local Demo
Please run the following command locally for an interactive demo:
```
python scripts/local_infer.py
```
where the default reference and sampling models are both gpt-neo-2.7B.

Using gpt-j-6B as the reference model can give more accurate detections:
```
python scripts/local_infer.py --reference_model_name gpt-j-6B
```

An example (using gpt-j-6B as the reference model) looks like this:
```
Please enter your text: (Press Enter twice to start processing)
Disguised as police, they broke through a fence on Monday evening and broke into the cargo of a Swiss-bound plane to take the valuable items. The audacious heist occurred at an airport in a small European country, leaving authorities baffled and airline officials in shock.

Fast-DetectGPT criterion is 1.9299, suggesting that the text has a probability of 87% to be machine-generated.
```

## Workspace
The following folders are created for our experiments:
* ./exp_main -> experiments for 5-model generations (main.sh).
* ./exp_gpt3to4 -> experiments for GPT-3, ChatGPT, and GPT-4 generations (gpt3to4.sh).

(Note: we share the generations from GPT-3, ChatGPT, and GPT-4 in exp_gpt3to4/data for convenient reproduction.)

### Citation
If you find this work useful, you can cite it with the following BibTeX entry:

    @inproceedings{bao2023fast,
      title={Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature},
      author={Bao, Guangsheng and Zhao, Yanbin and Teng, Zhiyang and Yang, Linyi and Zhang, Yue},
      booktitle={The Twelfth International Conference on Learning Representations},
      year={2024}
    }