Upload 30 files
- LICENSE +21 -0
- README.md +80 -14
- attack.sh +85 -0
- baselines.py +137 -0
- custom_datasets.py +96 -0
- data_builder.py +276 -0
- data_truncator.py +97 -0
- detect_gpt.py +295 -0
- detect_llm.py +128 -0
- detector.py +11 -0
- dna_gpt.py +211 -0
- fast_detect_gpt.py +162 -0
- gpt3to4.sh +116 -0
- gptzero.py +84 -0
- index.html +106 -0
- local_infer.py +94 -0
- main.sh +97 -0
- main_ext.sh +89 -0
- metrics.py +26 -0
- model.py +79 -0
- paraphrasing.py +106 -0
- report_results.py +490 -0
- requirements.txt +8 -3
- setup.sh +1 -0
- show_result.py +51 -0
- supervised.py +78 -0
- supervised.sh +56 -0
- temperature.sh +88 -0
- topk.sh +88 -0
- topp.sh +88 -0
LICENSE
ADDED
@@ -0,0 +1,21 @@
MIT License

Copyright (c) 2023 Bao Guangsheng

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
README.md
CHANGED
@@ -1,14 +1,80 @@
# Fast-DetectGPT
**This code is for ICLR 2024 paper "Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature"**, where we borrow or extend some code from [DetectGPT](https://github.com/eric-mitchell/detect-gpt).

[Paper](https://arxiv.org/abs/2310.05130)
| [LocalDemo](#local-demo)
| [OnlineDemo](http://region-9.autodl.pro:21504/)
| [OpenReview](https://openreview.net/forum?id=Bpcgcr8E8Z)


## Brief Intro
<table class="tg" style="padding-left: 30px;">
<tr>
<th class="tg-0pky">Method</th>
<th class="tg-0pky">5-Model Generations ↑</th>
<th class="tg-0pky">ChatGPT/GPT-4 Generations ↑</th>
<th class="tg-0pky">Speedup ↑</th>
</tr>
<tr>
<td class="tg-0pky">DetectGPT</td>
<td class="tg-0pky">0.9554</td>
<td class="tg-0pky">0.7225</td>
<td class="tg-0pky">1x</td>
</tr>
<tr>
<td class="tg-0pky">Fast-DetectGPT</td>
<td class="tg-0pky">0.9887 (relative↑ <b>74.7%</b>)</td>
<td class="tg-0pky">0.9338 (relative↑ <b>76.1%</b>)</td>
<td class="tg-0pky"><b>340x</b></td>
</tr>
</table>
The table shows detection accuracy (measured in AUROC) and computational speedup for machine-generated text detection. The <b>white-box setting</b> (directly using the source model) is used for detecting generations produced by five source models (5-model), whereas the <b>black-box setting</b> (utilizing surrogate models) targets ChatGPT and GPT-4 generations. AUROC results are averaged across various datasets and source models. Speedup assessments were conducted on a Tesla A100 GPU.


## Environment
* Python 3.8
* PyTorch 1.10.0
* Setup the environment:
  ```bash setup.sh```

(Notes: our experiments are run on 1 GPU of Tesla A100 with 80G memory.)

## Local Demo
Please run the following command locally for an interactive demo:
```
python scripts/local_infer.py
```
where the default reference and sampling models are both gpt-neo-2.7B.

We could use gpt-j-6B as the reference model to obtain more accurate detections:
```
python scripts/local_infer.py --reference_model_name gpt-j-6B
```

An example (using gpt-j-6B as the reference model) looks like:
```
Please enter your text: (Press Enter twice to start processing)
Disguised as police, they broke through a fence on Monday evening and broke into the cargo of a Swiss-bound plane to take the valuable items. The audacious heist occurred at an airport in a small European country, leaving authorities baffled and airline officials in shock.

Fast-DetectGPT criterion is 1.9299, suggesting that the text has a probability of 87% to be machine-generated.
```

## Workspace
The following folders are created for our experiments:
* ./exp_main -> experiments for 5-model generations (main.sh).
* ./exp_gpt3to4 -> experiments for GPT-3, ChatGPT, and GPT-4 generations (gpt3to4.sh).

(Notes: we share <b>generations from GPT-3, ChatGPT, and GPT-4</b> in exp_gpt3to4/data for convenient reproduction.)

### Citation
If you find this work useful, you can cite it with the following BibTeX entry:

    @inproceedings{bao2023fast,
      title={Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature},
      author={Bao, Guangsheng and Zhao, Yanbin and Teng, Zhiyang and Yang, Linyi and Zhang, Yue},
      booktitle={The Twelfth International Conference on Learning Representations},
      year={2023}
    }
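As a rough illustration of the conditional probability curvature idea behind the criterion value printed by the demo above: the score contrasts the log likelihood a scoring model assigns to the passage with the log likelihood it would expect if each token were re-sampled from its own conditional distribution. This is a minimal sketch of the concept only; the repository's actual criterion lives in fast_detect_gpt.py (part of this upload but not shown in this excerpt), and the exact normalization and the mapping from criterion value to the probability shown in the demo may differ.

```python
import torch.nn.functional as F

def curvature_sketch(logits, labels):
    # logits: [1, seq_len, vocab] from the scoring model; labels: [1, seq_len] token ids.
    # Sketch only -- see scripts/fast_detect_gpt.py for the real criterion.
    log_probs = F.log_softmax(logits, dim=-1)
    # log likelihood of the actual tokens
    ll = log_probs.gather(-1, labels.unsqueeze(-1)).squeeze(-1)        # [1, seq_len]
    probs = log_probs.exp()
    # expected log likelihood and variance under re-sampling from the model itself
    mean = (probs * log_probs).sum(-1)                                 # [1, seq_len]
    var = (probs * log_probs.square()).sum(-1) - mean.square()         # [1, seq_len]
    # curvature: how far the passage sits above the model's own expectation
    return (ll.sum(-1) - mean.sum(-1)) / var.sum(-1).sqrt()
```

In the white-box setting the scoring and sampling models coincide; in the black-box setting the expectation would instead be taken under a surrogate sampling model, as described in the intro above.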
attack.sh
ADDED
@@ -0,0 +1,85 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
para=t5  # "t5" for paraphrasing attack, or "random" for decoherence attack
exp_path=exp_attack
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

src_path=exp_gpt3to4
src_data_path=$src_path/data

datasets="xsum writing pubmed"
source_models="gpt-3.5-turbo"

# preparing dataset
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Preparing dataset ${D}_${M} by paraphrasing ${src_data_path}/${D}_${M} ...
    python scripts/paraphrasing.py --dataset $D --dataset_file $src_data_path/${D}_${M} \
                    --paraphraser $para --output_file $data_path/${D}_${M}
  done
done

# evaluate Fast-DetectGPT in the black-box setting
settings="gpt-j-6B:gpt2-xl gpt-j-6B:gpt-neo-2.7B gpt-j-6B:gpt-j-6B"
for D in $datasets; do
  for M in $source_models; do
    for S in $settings; do
      IFS=':' read -r -a S <<< $S && M1=${S[0]} && M2=${S[1]}
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name $M1 --scoring_model_name $M2 --discrepancy_analytic \
                      --dataset $D --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate supervised detectors
supervised_models="roberta-base-openai-detector roberta-large-openai-detector"
for D in $datasets; do
  for M in $source_models; do
    for SM in $supervised_models; do
      echo `date`, Evaluating ${SM} on ${D}_${M} ...
      python scripts/supervised.py --model_name $SM --dataset $D \
                      --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    done
  done
done

# evaluate fast baselines
scoring_models="gpt-neo-2.7B"
for D in $datasets; do
  for M in $source_models; do
    for M2 in $scoring_models; do
      echo `date`, Evaluating baseline methods on ${D}_${M}.${M2} ...
      python scripts/baselines.py --scoring_model_name ${M2} --dataset $D \
                      --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M2}
    done
  done
done

# evaluate DetectGPT and DetectLLM
scoring_models="gpt2-xl gpt-neo-2.7B gpt-j-6B"
for D in $datasets; do
  for M in $source_models; do
    M1=t5-11b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                      --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                      --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done
baselines.py
ADDED
@@ -0,0 +1,137 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import numpy as np
import torch
import torch.nn.functional as F
import tqdm
import argparse
import json
from data_builder import load_data
from model import load_tokenizer, load_model
from metrics import get_roc_metrics, get_precision_recall_metrics

def get_likelihood(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    logits = logits.view(-1, logits.shape[-1])
    labels = labels.view(-1)
    log_probs = torch.nn.functional.log_softmax(logits, dim=-1)
    log_likelihood = log_probs.gather(dim=-1, index=labels.unsqueeze(-1)).squeeze(-1)
    return log_likelihood.mean().item()

def get_rank(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    # get rank of each label token in the model's likelihood ordering
    matches = (logits.argsort(-1, descending=True) == labels.unsqueeze(-1)).nonzero()
    assert matches.shape[1] == 3, f"Expected 3 dimensions in matches tensor, got {matches.shape}"

    ranks, timesteps = matches[:, -1], matches[:, -2]

    # make sure we got exactly one match for each timestep in the sequence
    assert (timesteps == torch.arange(len(timesteps)).to(timesteps.device)).all(), "Expected one match per timestep"

    ranks = ranks.float() + 1  # convert to 1-indexed rank
    return -ranks.mean().item()

def get_logrank(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    # get rank of each label token in the model's likelihood ordering
    matches = (logits.argsort(-1, descending=True) == labels.unsqueeze(-1)).nonzero()
    assert matches.shape[1] == 3, f"Expected 3 dimensions in matches tensor, got {matches.shape}"

    ranks, timesteps = matches[:, -1], matches[:, -2]

    # make sure we got exactly one match for each timestep in the sequence
    assert (timesteps == torch.arange(len(timesteps)).to(timesteps.device)).all(), "Expected one match per timestep"

    ranks = ranks.float() + 1  # convert to 1-indexed rank
    ranks = torch.log(ranks)
    return -ranks.mean().item()

def get_entropy(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    entropy = F.softmax(logits, dim=-1) * F.log_softmax(logits, dim=-1)
    entropy = -entropy.sum(-1)
    return entropy.mean().item()


def experiment(args):
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
    scoring_model.eval()
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # eval criterions
    criterion_fns = {'likelihood': get_likelihood,
                     'rank': get_rank,
                     'logrank': get_logrank,
                     'entropy': get_entropy}
    for name in criterion_fns:
        criterion_fn = criterion_fns[name]
        torch.manual_seed(args.seed)
        np.random.seed(args.seed)
        eval_results = []
        for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
            original_text = data["original"][idx]
            sampled_text = data["sampled"][idx]
            # original text
            tokenized = scoring_tokenizer(original_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
            labels = tokenized.input_ids[:, 1:]
            with torch.no_grad():
                logits = scoring_model(**tokenized).logits[:, :-1]
                original_crit = criterion_fn(logits, labels)
            # sampled text
            tokenized = scoring_tokenizer(sampled_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
            labels = tokenized.input_ids[:, 1:]
            with torch.no_grad():
                logits = scoring_model(**tokenized).logits[:, :-1]
                sampled_crit = criterion_fn(logits, labels)
            # result
            eval_results.append({"original": original_text,
                                 "original_crit": original_crit,
                                 "sampled": sampled_text,
                                 "sampled_crit": sampled_crit})

        # compute prediction scores for real/sampled passages
        predictions = {'real': [x["original_crit"] for x in eval_results],
                       'samples': [x["sampled_crit"] for x in eval_results]}
        fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
        p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
        print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
        # log results
        results_file = f'{args.output_file}.{name}.json'
        results = { 'name': f'{name}_threshold',
                    'info': {'n_samples': n_samples},
                    'predictions': predictions,
                    'raw_results': eval_results,
                    'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
                    'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
                    'loss': 1 - pr_auc}
        with open(results_file, 'w') as fout:
            json.dump(results, fout)
            print(f'Results written into {results_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--scoring_model_name', type=str, default="gpt2")
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
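For reference, the criterion functions above all operate on batch-size-1 tensors of next-token logits and the corresponding shifted labels. A minimal self-contained check with a toy vocabulary (assuming baselines.py is importable from the working directory; the models and data loaders of the repository are not needed for this) might look like:

```python
import torch
from baselines import get_likelihood, get_rank, get_logrank, get_entropy

vocab, seq_len = 11, 7
torch.manual_seed(0)
logits = torch.randn(1, seq_len, vocab)         # [batch=1, seq_len, vocab]
labels = torch.randint(0, vocab, (1, seq_len))  # [batch=1, seq_len]

print(get_likelihood(logits, labels))  # mean token log-likelihood (higher = more likely)
print(get_rank(logits, labels))        # negative mean 1-indexed rank of each label token
print(get_logrank(logits, labels))     # negative mean log-rank
print(get_entropy(logits, labels))     # mean predictive entropy
```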
custom_datasets.py
ADDED
@@ -0,0 +1,96 @@
import os.path
import random
import datasets

SEPARATOR = '<<<SEP>>>'


DATASETS = ['writing', 'english', 'german', 'pubmed']

def load_dataset(path, name=None, split=None, cache_dir=None):
    # use local model if it exists
    local_path = os.path.join(cache_dir, f'local.{path}_{name}_{split}')
    if os.path.exists(local_path):
        return datasets.load_from_disk(local_path)
    return datasets.load_dataset(path, name, split=split, cache_dir=cache_dir)

def load_pubmed(cache_dir):
    data = load_dataset('pubmed_qa', 'pqa_labeled', split='train', cache_dir=cache_dir)

    # combine question and long_answer
    data = [f'Question: {q} Answer:{SEPARATOR}{a}' for q, a in zip(data['question'], data['long_answer'])]

    return data


def process_prompt(prompt):
    return prompt.replace('[ WP ]', '').replace('[ OT ]', '')


def process_spaces(story):
    return story.replace(
        ' ,', ',').replace(
        ' .', '.').replace(
        ' ?', '?').replace(
        ' !', '!').replace(
        ' ;', ';').replace(
        ' \'', '\'').replace(
        ' ’ ', '\'').replace(
        ' :', ':').replace(
        '<newline>', '\n').replace(
        '`` ', '"').replace(
        ' \'\'', '"').replace(
        '\'\'', '"').replace(
        '.. ', '... ').replace(
        ' )', ')').replace(
        '( ', '(').replace(
        ' n\'t', 'n\'t').replace(
        ' i ', ' I ').replace(
        ' i\'', ' I\'').replace(
        '\\\'', '\'').replace(
        '\n ', '\n').strip()


def load_writing(cache_dir=None):
    writing_path = 'data/writingPrompts'

    with open(f'{writing_path}/valid.wp_source', 'r') as f:
        prompts = f.readlines()
    with open(f'{writing_path}/valid.wp_target', 'r') as f:
        stories = f.readlines()

    prompts = [process_prompt(prompt) for prompt in prompts]
    joined = [process_spaces(prompt + " " + story) for prompt, story in zip(prompts, stories)]
    filtered = [story for story in joined if 'nsfw' not in story and 'NSFW' not in story]

    random.seed(0)
    random.shuffle(filtered)

    return filtered


def load_language(language, cache_dir):
    # load either the english or german portion of the wmt16 dataset
    assert language in ['en', 'de']
    d = load_dataset('wmt16', 'de-en', split='train', cache_dir=cache_dir)
    docs = d['translation']
    desired_language_docs = [d[language] for d in docs]
    lens = [len(d.split()) for d in desired_language_docs]
    sub = [d for d, l in zip(desired_language_docs, lens) if l > 100 and l < 150]
    return sub


def load_german(cache_dir):
    return load_language('de', cache_dir)


def load_english(cache_dir):
    return load_language('en', cache_dir)


def load(name, cache_dir, **kwargs):
    if name in DATASETS:
        load_fn = globals()[f'load_{name}']
        return load_fn(cache_dir=cache_dir, **kwargs)
    else:
        raise ValueError(f'Unknown dataset {name}')
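As a small illustration of what process_spaces above does to the pre-tokenized WritingPrompts text (toy input, not from the dataset; assumes custom_datasets.py is importable):

```python
from custom_datasets import process_spaces

raw = "`` I ca n't believe it , '' she said . <newline> He laughed ."
print(process_spaces(raw))
# quotes, contractions and punctuation spacing are normalized, and <newline> becomes a real line break
```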
data_builder.py
ADDED
@@ -0,0 +1,276 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import time

import numpy as np
import datasets
import torch
import random
import argparse
import os
import json
import custom_datasets
from model import load_tokenizer, load_model


def save_data(output_file, args, data):
    # write args to file
    args_file = f"{output_file}.args.json"
    with open(args_file, "w") as fout:
        json.dump(args.__dict__, fout, indent=4)
        print(f"Args written into {args_file}")

    # write the data to a json file in the save folder
    data_file = f"{output_file}.raw_data.json"
    with open(data_file, "w") as fout:
        json.dump(data, fout, indent=4)
        print(f"Raw data written into {data_file}")


def load_data(input_file):
    data_file = f"{input_file}.raw_data.json"
    with open(data_file, "r") as fin:
        data = json.load(fin)
        print(f"Raw data loaded from {data_file}")
    return data


class DataBuilder:
    def __init__(self, args):
        self.args = args
        self.base_tokenizer = load_tokenizer(args.base_model_name, args.dataset, args.cache_dir)
        self.base_model = None if args.openai_model else load_model(args.base_model_name, args.device, args.cache_dir)

    def _openai_sample(self, prefix):
        def _drop_last_word(text):
            return ' '.join(text.split(' ')[:-1])

        import openai
        assert self.args.openai_key is not None, "Must provide OpenAI API key as --openai_key"
        openai.api_key = self.args.openai_key
        if self.args.openai_base is not None:
            openai.api_base = self.args.openai_base

        if self.args.dataset != 'pubmed':  # keep Answer: prefix for pubmed
            prefix = _drop_last_word(prefix)

        # sample from the openai model
        kwargs = {"max_tokens": 200}
        if self.args.do_top_p:
            kwargs['top_p'] = self.args.top_p
        elif self.args.do_top_k:
            kwargs['top_k'] = self.args.top_k
        elif self.args.do_temperature:
            kwargs['temperature'] = self.args.temperature

        if self.args.openai_model == 'davinci':
            kwargs["engine"] = self.args.openai_model
            response = openai.Completion.create(prompt=f"{prefix}", **kwargs)
            return prefix + response['choices'][0]['text']

        elif self.args.openai_model in ['gpt-3.5-turbo', 'gpt-4']:
            roles = {'xsum': 'You are a News writer.',
                     'writing': 'You are a Fiction writer.',
                     'pubmed': 'You are a Technical writer.'}
            prompts = {'xsum': 'Please write an article with about 150 words starting exactly with:',
                       'writing': 'Please write an article with about 150 words starting exactly with:',
                       'pubmed': 'Please answer the question in about 50 words.'}
            messages = [
                {'role': 'system', 'content': roles[self.args.dataset]},
                {'role': 'user', 'content': f'{prompts[self.args.dataset]} {prefix}'},
            ]
            kwargs["model"] = self.args.openai_model
            kwargs["messages"] = messages
            response = openai.ChatCompletion.create(**kwargs)
            response = response['choices'][0]['message']['content']
            # ChatGPT may repeat the prefix
            if response.startswith(prefix[:20]):
                return response
            return prefix + ' ' + response

        else:
            raise NotImplementedError

    # sample from base_model using ****only**** the first 30 tokens in each example as context
    def _sample_from_model(self, texts, min_words=55, prompt_tokens=30):
        # encode each text as a list of token ids
        if self.args.dataset == 'pubmed':
            texts = [t[:t.index(custom_datasets.SEPARATOR)] for t in texts]
            all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True, return_token_type_ids=False).to(self.args.device)
        else:
            all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True, return_token_type_ids=False).to(self.args.device)
            all_encoded = {key: value[:, :prompt_tokens] for key, value in all_encoded.items()}

        if self.args.openai_model:
            # decode the prefixes back into text
            prefixes = self.base_tokenizer.batch_decode(all_encoded['input_ids'], skip_special_tokens=True)

            decoded = []
            for idx, prefix in enumerate(prefixes):
                while idx >= len(decoded):
                    try:
                        decoded.append(self._openai_sample(prefix))
                    except Exception as ex:
                        print(ex)
                        print('Wait 10 minutes before retry ...')
                        time.sleep(600)

        else:
            self.base_model.eval()
            decoded = ['' for _ in range(len(texts))]

            # sample from the model until we get a sample with at least min_words words for each example
            # this is an inefficient way to do this (since we regenerate for all inputs if just one is too short), but it works
            tries = 0
            m = 0
            while m < min_words:
                if tries != 0:
                    print()
                    print(f"min words: {m}, needed {min_words}, regenerating (try {tries})")
                    prefixes = self.base_tokenizer.batch_decode(all_encoded['input_ids'], skip_special_tokens=True)
                    for prefix, x in zip(prefixes, decoded):
                        if len(x.split()) == m:
                            print(prefix, '=>', x)

                sampling_kwargs = {}
                if self.args.do_top_p:
                    sampling_kwargs['top_p'] = self.args.top_p
                elif self.args.do_top_k:
                    sampling_kwargs['top_k'] = self.args.top_k
                elif self.args.do_temperature:
                    sampling_kwargs['temperature'] = self.args.temperature
                min_length = 50 if self.args.dataset in ['pubmed'] else 150
                outputs = self.base_model.generate(**all_encoded, min_length=min_length, max_length=200, do_sample=True,
                                                   **sampling_kwargs, pad_token_id=self.base_tokenizer.eos_token_id,
                                                   eos_token_id=self.base_tokenizer.eos_token_id)
                decoded = self.base_tokenizer.batch_decode(outputs, skip_special_tokens=True)
                m = min(len(x.split()) for x in decoded)
                tries += 1

        return decoded

    def generate_samples(self, raw_data, batch_size):
        # trim to shorter length
        def _trim_to_shorter_length(texta, textb):
            # truncate to shorter of o and s
            shorter_length = min(len(texta.split(' ')), len(textb.split(' ')))
            texta = ' '.join(texta.split(' ')[:shorter_length])
            textb = ' '.join(textb.split(' ')[:shorter_length])
            return texta, textb

        def _truncate_to_substring(text, substring, idx_occurrence):
            # truncate everything after the idx_occurrence occurrence of substring
            assert idx_occurrence > 0, 'idx_occurrence must be > 0'
            idx = -1
            for _ in range(idx_occurrence):
                idx = text.find(substring, idx + 1)
                if idx == -1:
                    return text
            return text[:idx]

        data = {
            "original": [],
            "sampled": [],
        }

        for batch in range(len(raw_data) // batch_size):
            print('Generating samples for batch', batch, 'of', len(raw_data) // batch_size)
            original_text = raw_data[batch * batch_size:(batch + 1) * batch_size]
            sampled_text = self._sample_from_model(original_text, min_words=30 if self.args.dataset in ['pubmed'] else 55)

            for o, s in zip(original_text, sampled_text):
                if self.args.dataset == 'pubmed':
                    s = _truncate_to_substring(s, 'Question:', 2)
                    o = o.replace(custom_datasets.SEPARATOR, ' ')

                o, s = _trim_to_shorter_length(o, s)

                # add to the data
                data["original"].append(o)
                data["sampled"].append(s)

        return data

def generate_data(args, dataset, key):
    # strip newlines from each example; replace one or more newlines with a single space
    def _strip_newlines(text):
        return ' '.join(text.split())

    # load data
    if dataset in custom_datasets.DATASETS:
        data = custom_datasets.load(dataset, args.cache_dir)
    else:
        data = custom_datasets.load_dataset(dataset, split='train', cache_dir=args.cache_dir)[key]

    # get unique examples, strip whitespace, and remove newlines
    # then take just the long examples, shuffle, take the first 5,000 to tokenize to save time
    # then take just the examples that are <= 512 tokens (for the base model)
    # then generate n_samples samples

    # remove duplicates from the data
    data = list(dict.fromkeys(data))  # deterministic, as opposed to set()

    # strip whitespace around each example
    data = [x.strip() for x in data]

    # remove newlines from each example
    data = [_strip_newlines(x) for x in data]

    # try to keep only examples with > 250 words
    if dataset in ['writing', 'squad', 'xsum']:
        long_data = [x for x in data if len(x.split()) > 250]
        if len(long_data) > 0:
            data = long_data

    random.shuffle(data)
    data = data[:5_000]

    # keep only examples with <= 512 tokens according to base_tokenizer
    # this step has the extra effect of removing examples with low-quality/garbage content
    data_builder = DataBuilder(args)
    tokenized_data = data_builder.base_tokenizer(data)
    data = [x for x, y in zip(data, tokenized_data["input_ids"]) if len(y) <= 512]

    # print stats about remaining data
    print(f"Total number of samples: {len(data)}")
    print(f"Average number of words: {np.mean([len(x.split()) for x in data])}")

    return data_builder.generate_samples(data[:args.n_samples], batch_size=args.batch_size)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_gpt3/data/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--n_samples', type=int, default=200)
    parser.add_argument('--openai_base', type=str, default=None)
    parser.add_argument('--openai_key', type=str, default=None)
    parser.add_argument('--openai_model', type=str, default=None)  # davinci, gpt-3.5-turbo, gpt-4
    parser.add_argument('--base_model_name', type=str, default="gpt2")
    parser.add_argument('--batch_size', type=int, default=50)
    parser.add_argument('--do_top_k', action='store_true')
    parser.add_argument('--top_k', type=int, default=40)
    parser.add_argument('--do_top_p', action='store_true')
    parser.add_argument('--top_p', type=float, default=0.96)
    parser.add_argument('--do_temperature', action='store_true')
    parser.add_argument('--temperature', type=float, default=0.8)
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    os.environ["XDG_CACHE_HOME"] = args.cache_dir
    if not os.path.exists(args.cache_dir):
        os.makedirs(args.cache_dir)
    print(f"Using cache dir {args.cache_dir}")

    random.seed(args.seed)
    torch.manual_seed(args.seed)
    np.random.seed(args.seed)

    print(f'Loading dataset {args.dataset}...')
    dataset_keys = {'xsum': 'document', 'squad': 'context', 'writing': 'document'}
    data = generate_data(args, args.dataset, dataset_keys[args.dataset] if args.dataset in dataset_keys else None)

    save_data(args.output_file, args, data)
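For orientation, the builder above writes two files per dataset/model pair, and the raw data file holds parallel lists of human-written originals and machine-generated samples. A toy illustration of loading one of the shared files back (the path follows the `exp_gpt3to4/data/<dataset>_<model>` pattern used elsewhere in this upload and is only an example):

```python
from data_builder import load_data

data = load_data("./exp_gpt3to4/data/xsum_gpt-3.5-turbo")  # reads <file>.raw_data.json
print(len(data["original"]), len(data["sampled"]))  # parallel lists of equal length
print(data["original"][0][:80])  # a human-written passage
print(data["sampled"][0][:80])   # the matching machine-generated passage
```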
data_truncator.py
ADDED
@@ -0,0 +1,97 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import time

import numpy as np
import datasets
import torch
import random
import argparse
import os
import json
import custom_datasets
from model import load_tokenizer, load_model

def stats_str(data):
    if type(data) == dict:
        mean_orig = np.mean([len(v.split()) for v in data['original']])
        mean_samp = np.mean([len(v.split()) for v in data['sampled']])
        return f'{mean_orig:.0f} words (original), {mean_samp:.0f} words (sampled).'
    else:
        mean_orig = np.mean([len(v['original'].split()) for v in data])
        mean_samp = np.mean([len(v['sampled'].split()) for v in data])
        mean_perturb_orig = np.mean([np.mean([len(p.split()) for p in v['perturbed_original']]) for v in data])
        mean_perturb_samp = np.mean([np.mean([len(p.split()) for p in v['perturbed_sampled']]) for v in data])
        return f'{mean_orig:.0f} words (original), {mean_samp:.0f} words (sampled), {mean_perturb_orig:.0f} words (perturb original), {mean_perturb_samp:.0f} words (perturb sampled).'

def save_data(output_file, args, data):
    # write args to file
    args_file = f"{output_file}.args.json"
    with open(args_file, "w") as fout:
        json.dump(args, fout, indent=4)
        print(f"Args written into {args_file}")

    # write the data to a json file in the save folder
    data_file = f"{output_file}.raw_data.json"
    with open(data_file, "w") as fout:
        json.dump(data, fout, indent=4)
        print(f"Raw data written into {data_file}: {stats_str(data)}")


def load_data(input_file):
    # load args from file
    args_file = f"{input_file}.args.json"
    with open(args_file, "r") as fin:
        args = json.load(fin)
        print(f"Args loaded from {args_file}")

    # load the data from file
    data_file = f"{input_file}.raw_data.json"
    with open(data_file, "r") as fin:
        data = json.load(fin)
        print(f"Raw data loaded from {data_file}: {stats_str(data)}")

    return args, data

def convert_data(input_file, output_file, max_words):
    def _reduce(text):
        lines = []
        nwords = 0
        for line in text.split('\n'):
            if nwords >= max_words:
                break
            words = line.split()
            words = words[:max_words - nwords]
            lines.append(' '.join(words))
            nwords += len(words)
        return '\n'.join(lines)

    args, data = load_data(input_file)
    if type(data) == dict:
        data['original'] = [_reduce(x) for x in data['original']]
        data['sampled'] = [_reduce(x) for x in data['sampled']]
    else:
        for item in data:
            item['original'] = _reduce(item['original'])
            item['sampled'] = _reduce(item['sampled'])
            item['perturbed_original'] = [_reduce(x) for x in item['perturbed_original']]
            item['perturbed_sampled'] = [_reduce(x) for x in item['perturbed_sampled']]

    save_data(output_file, args, data)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--input_path', type=str, default="./exp_gpt3to4/data/")
    parser.add_argument('--output_path', type=str, default="./exp_maxlen150/data/")
    parser.add_argument('--max_words', type=int, default=150)
    args = parser.parse_args()

    import glob
    import os.path as path

    for file_name in glob.glob(f'{args.input_path}/*.raw_data.json'):
        print(file_name)
        file_name = path.basename(file_name).replace('.raw_data.json', '')
        convert_data(path.join(args.input_path, file_name), path.join(args.output_path, file_name), args.max_words)
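A quick illustration of the truncation rule implemented by _reduce inside convert_data above: words are counted across lines, each line is cut once the running total reaches max_words, and later lines are dropped. The standalone copy below mirrors that nested helper purely for demonstration:

```python
# standalone copy of convert_data._reduce, for illustration only
def reduce_to_max_words(text, max_words):
    lines, nwords = [], 0
    for line in text.split('\n'):
        if nwords >= max_words:
            break
        words = line.split()[:max_words - nwords]
        lines.append(' '.join(words))
        nwords += len(words)
    return '\n'.join(lines)

print(reduce_to_max_words("one two three four\nfive six seven\neight nine", 5))
# one two three four
# five
```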
detect_gpt.py
ADDED
@@ -0,0 +1,295 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import os.path

import numpy as np
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import re
import torch
import tqdm
import argparse
import json
from data_builder import load_data, save_data
from metrics import get_roc_metrics, get_precision_recall_metrics
from model import load_tokenizer, load_model, get_model_fullname, from_pretrained

# define regex to match all <extra_id_*> tokens, where * is an integer
pattern = re.compile(r"<extra_id_\d+>")

def load_mask_model(model_name, device, cache_dir):
    model_name = get_model_fullname(model_name)
    # mask filling t5 model
    print(f'Loading mask filling model {model_name}...')
    mask_model = from_pretrained(AutoModelForSeq2SeqLM, model_name, {}, cache_dir)
    mask_model = mask_model.to(device)
    return mask_model

def load_mask_tokenizer(model_name, max_length, cache_dir):
    model_name = get_model_fullname(model_name)
    tokenizer = from_pretrained(AutoTokenizer, model_name, {'model_max_length': max_length}, cache_dir)
    return tokenizer

def tokenize_and_mask(text, span_length, pct, ceil_pct=False):
    buffer_size = 1
    tokens = text.split(' ')
    mask_string = '<<<mask>>>'

    n_spans = pct * len(tokens) / (span_length + buffer_size * 2)
    if ceil_pct:
        n_spans = np.ceil(n_spans)
    n_spans = int(n_spans)

    n_masks = 0
    while n_masks < n_spans:
        start = np.random.randint(0, len(tokens) - span_length)
        end = start + span_length
        search_start = max(0, start - buffer_size)
        search_end = min(len(tokens), end + buffer_size)
        if mask_string not in tokens[search_start:search_end]:
            tokens[start:end] = [mask_string]
            n_masks += 1

    # replace each occurrence of mask_string with <extra_id_NUM>, where NUM increments
    num_filled = 0
    for idx, token in enumerate(tokens):
        if token == mask_string:
            tokens[idx] = f'<extra_id_{num_filled}>'
            num_filled += 1
    assert num_filled == n_masks, f"num_filled {num_filled} != n_masks {n_masks}"
    text = ' '.join(tokens)
    return text

def count_masks(texts):
    return [len([x for x in text.split() if x.startswith("<extra_id_")]) for text in texts]

# replace each masked span with a sample from T5 mask_model
def replace_masks(args, mask_model, mask_tokenizer, texts):
    n_expected = count_masks(texts)
    stop_id = mask_tokenizer.encode(f"<extra_id_{max(n_expected)}>")[0]
    tokens = mask_tokenizer(texts, return_tensors="pt", padding=True).to(args.device)
    outputs = mask_model.generate(**tokens, max_length=150, do_sample=True, top_p=args.mask_top_p,
                                  num_return_sequences=1, eos_token_id=stop_id)
    return mask_tokenizer.batch_decode(outputs, skip_special_tokens=False)

def extract_fills(texts):
    # remove <pad> from beginning of each text
    texts = [x.replace("<pad>", "").replace("</s>", "").strip() for x in texts]

    # return the text in between each matched mask token
    extracted_fills = [pattern.split(x)[1:-1] for x in texts]

    # remove whitespace around each fill
    extracted_fills = [[y.strip() for y in x] for x in extracted_fills]

    return extracted_fills

def apply_extracted_fills(masked_texts, extracted_fills):
    # split masked text into tokens, only splitting on spaces (not newlines)
    tokens = [x.split(' ') for x in masked_texts]

    n_expected = count_masks(masked_texts)

    # replace each mask token with the corresponding fill
    for idx, (text, fills, n) in enumerate(zip(tokens, extracted_fills, n_expected)):
        if len(fills) < n:
            tokens[idx] = []
        else:
            for fill_idx in range(n):
                text[text.index(f"<extra_id_{fill_idx}>")] = fills[fill_idx]

    # join tokens back into text
    texts = [" ".join(x) for x in tokens]
    return texts

def perturb_texts_(args, mask_model, mask_tokenizer, texts, ceil_pct=False):
    span_length = args.span_length
    pct = args.pct_words_masked
    masked_texts = [tokenize_and_mask(x, span_length, pct, ceil_pct) for x in texts]
    raw_fills = replace_masks(args, mask_model, mask_tokenizer, masked_texts)
    extracted_fills = extract_fills(raw_fills)
    perturbed_texts = apply_extracted_fills(masked_texts, extracted_fills)

    # Handle the fact that sometimes the model doesn't generate the right number of fills and we have to try again
    attempts = 1
    while '' in perturbed_texts:
        idxs = [idx for idx, x in enumerate(perturbed_texts) if x == '']
        print(f'WARNING: {len(idxs)} texts have no fills. Trying again [attempt {attempts}].')
        masked_texts = [tokenize_and_mask(x, span_length, pct, ceil_pct) for idx, x in enumerate(texts) if idx in idxs]
        raw_fills = replace_masks(args, mask_model, mask_tokenizer, masked_texts)
        extracted_fills = extract_fills(raw_fills)
        new_perturbed_texts = apply_extracted_fills(masked_texts, extracted_fills)
        for idx, x in zip(idxs, new_perturbed_texts):
            perturbed_texts[idx] = x
        attempts += 1
    return perturbed_texts

def perturb_texts(args, mask_model, mask_tokenizer, texts, ceil_pct=False):
    chunk_size = 10
    outputs = []
    for i in range(0, len(texts), chunk_size):
        outputs.extend(perturb_texts_(args, mask_model, mask_tokenizer, texts[i:i + chunk_size], ceil_pct=ceil_pct))
    return outputs

# Get the log likelihood of each text under the base_model
def get_ll(args, scoring_model, scoring_tokenizer, text):
    with torch.no_grad():
        tokenized = scoring_tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids
        return -scoring_model(**tokenized, labels=labels).loss.item()

def get_lls(args, scoring_model, scoring_tokenizer, texts):
    return [get_ll(args, scoring_model, scoring_tokenizer, text) for text in texts]


def generate_perturbs(args):
    n_perturbations = args.n_perturbations
    name = f'perturbation_{n_perturbations}'
    # load model
    mask_model = load_mask_model(args.mask_filling_model_name, args.device, args.cache_dir)
    mask_model.eval()
    try:
        n_positions = mask_model.config.n_positions
    except AttributeError:
        n_positions = 512
    mask_tokenizer = load_mask_tokenizer(args.mask_filling_model_name, n_positions, args.cache_dir)

    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])

    torch.manual_seed(args.seed)
    np.random.seed(args.seed)

    # generate perturb samples
    perturbs = []
    for idx in tqdm.tqdm(range(n_samples), desc=f"Perturb text"):
        original_text = data["original"][idx]
        sampled_text = data["sampled"][idx]
        # perturb
        p_sampled_text = perturb_texts(args, mask_model, mask_tokenizer, [sampled_text for _ in range(n_perturbations)])
        p_original_text = perturb_texts(args, mask_model, mask_tokenizer, [original_text for _ in range(n_perturbations)])
        assert len(p_sampled_text) == n_perturbations, f"Expected {n_perturbations} perturbed samples, got {len(p_sampled_text)}"
        assert len(p_original_text) == n_perturbations, f"Expected {n_perturbations} perturbed samples, got {len(p_original_text)}"
        # result
        perturbs.append({
            "original": original_text,
            "sampled": sampled_text,
            "perturbed_sampled": p_sampled_text,
            "perturbed_original": p_original_text
        })

    save_data(f'{args.dataset_file}.{args.mask_filling_model_name}.{name}', args, perturbs)


def experiment(args):
    n_perturbations = args.n_perturbations
    name = f'perturbation_{n_perturbations}'
    perturb_file = f'{args.dataset_file}.{args.mask_filling_model_name}.{name}.raw_data.json'
    if os.path.exists(perturb_file):
        print(f'Use existing perturbation file: {perturb_file}')
    else:
        generate_perturbs(args)
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, 'cpu', args.cache_dir)
    scoring_model.eval()
    scoring_model.to(args.device)
    # load data
    data = load_data(f'{args.dataset_file}.{args.mask_filling_model_name}.{name}')
    n_samples = len(data)

    torch.manual_seed(args.seed)
    np.random.seed(args.seed)

    # Evaluate
    results = data
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = results[idx]["original"]
        sampled_text = results[idx]["sampled"]
        perturbed_original = results[idx]["perturbed_original"]
        perturbed_sampled = results[idx]["perturbed_sampled"]
        # original text
        original_ll = get_ll(args, scoring_model, scoring_tokenizer, original_text)
        p_original_ll = get_lls(args, scoring_model, scoring_tokenizer, perturbed_original)
        # sampled text
        sampled_ll = get_ll(args, scoring_model, scoring_tokenizer, sampled_text)
        p_sampled_ll = get_lls(args, scoring_model, scoring_tokenizer, perturbed_sampled)
        # result
        results[idx]["original_ll"] = original_ll
        results[idx]["sampled_ll"] = sampled_ll
        results[idx]["all_perturbed_sampled_ll"] = p_sampled_ll
        results[idx]["all_perturbed_original_ll"] = p_original_ll
        results[idx]["perturbed_sampled_ll"] = np.mean(p_sampled_ll)
        results[idx]["perturbed_original_ll"] = np.mean(p_original_ll)
        results[idx]["perturbed_sampled_ll_std"] = np.std(p_sampled_ll) if len(p_sampled_ll) > 1 else 1
        results[idx]["perturbed_original_ll_std"] = np.std(p_original_ll) if len(p_original_ll) > 1 else 1

    # compute diffs with perturbed
    predictions = {'real': [], 'samples': []}
    for res in results:
        if res['perturbed_original_ll_std'] == 0:
            res['perturbed_original_ll_std'] = 1
            print("WARNING: std of perturbed original is 0, setting to 1")
            print(f"Number of unique perturbed original texts: {len(set(res['perturbed_original']))}")
            print(f"Original text: {res['original']}")
        if res['perturbed_sampled_ll_std'] == 0:
            res['perturbed_sampled_ll_std'] = 1
            print("WARNING: std of perturbed sampled is 0, setting to 1")
            print(f"Number of unique perturbed sampled texts: {len(set(res['perturbed_sampled']))}")
            print(f"Sampled text: {res['sampled']}")
        predictions['real'].append((res['original_ll'] - res['perturbed_original_ll']) / res['perturbed_original_ll_std'])
        predictions['samples'].append((res['sampled_ll'] - res['perturbed_sampled_ll']) / res['perturbed_sampled_ll_std'])

    print(f"Real mean/std: {np.mean(predictions['real']):.2f}/{np.std(predictions['real']):.2f}, Samples mean/std: {np.mean(predictions['samples']):.2f}/{np.std(predictions['samples']):.2f}")
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")

    # results
    results_file = f'{args.output_file}.{name}.json'
    results = {
        'name': name,
        'info': {
            'pct_words_masked': args.pct_words_masked,
            'span_length': args.span_length,
            'n_perturbations': args.n_perturbations,
            'n_samples': n_samples,
        },
        'predictions': predictions,
        'raw_results': results,
        'metrics': {
            'roc_auc': roc_auc,
            'fpr': fpr,
            'tpr': tpr,
        },
        'pr_metrics': {
            'pr_auc': pr_auc,
            'precision': p,
            'recall': r,
        },
        'loss': 1 - pr_auc,
    }
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
        print(f'Results written into {results_file}')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--pct_words_masked', type=float, default=0.3)  # pct masked is actually pct_words_masked * (span_length / (span_length + 2 * buffer_size))
    parser.add_argument('--mask_top_p', type=float, default=1.0)
    parser.add_argument('--span_length', type=int, default=2)
    parser.add_argument('--n_perturbations', type=int, default=10)
    parser.add_argument('--scoring_model_name', type=str, default="gpt2")
    parser.add_argument('--mask_filling_model_name', type=str, default="t5-small")
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
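The score that experiment() above feeds into the ROC computation is the normalized perturbation discrepancy: the text's log likelihood minus the mean log likelihood of its T5-perturbed variants, divided by the standard deviation of those perturbed scores. Written out on its own (the numbers below are illustrative, not real results):

```python
import numpy as np

def perturbation_discrepancy(ll, perturbed_lls):
    # log likelihood of the candidate text minus the mean over its perturbed variants,
    # normalized by the std of the perturbed scores (with the same 0/len-1 fallbacks as above)
    std = np.std(perturbed_lls) if len(perturbed_lls) > 1 else 1.0
    return (ll - np.mean(perturbed_lls)) / (std if std != 0 else 1.0)

print(perturbation_discrepancy(-2.1, [-2.6, -2.5, -2.8, -2.7]))
# machine-generated text tends to produce a larger discrepancy than human-written text
```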
detect_llm.py
ADDED
@@ -0,0 +1,128 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import numpy as np
import torch
import torch.nn.functional as F
import tqdm
import argparse
import json
from model import load_tokenizer, load_model
from metrics import get_roc_metrics, get_precision_recall_metrics
from data_builder import load_data

def get_likelihood(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    logits = logits.view(-1, logits.shape[-1])
    labels = labels.view(-1)
    log_probs = torch.nn.functional.log_softmax(logits, dim=-1)
    log_likelihood = log_probs.gather(dim=-1, index=labels.unsqueeze(-1)).squeeze(-1)
    return log_likelihood.mean().item()

def get_logrank(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1

    # get rank of each label token in the model's likelihood ordering
    matches = (logits.argsort(-1, descending=True) == labels.unsqueeze(-1)).nonzero()
    assert matches.shape[1] == 3, f"Expected 3 dimensions in matches tensor, got {matches.shape}"

    ranks, timesteps = matches[:, -1], matches[:, -2]

    # make sure we got exactly one match for each timestep in the sequence
    assert (timesteps == torch.arange(len(timesteps)).to(timesteps.device)).all(), "Expected one match per timestep"

    ranks = ranks.float() + 1  # convert to 1-indexed rank
    ranks = torch.log(ranks)
    return ranks.mean().item()

# Log-Likelihood Log-Rank Ratio
def get_lrr(args, scoring_model, scoring_tokenizer, text, perturbs):
    with torch.no_grad():
        tokenized = scoring_tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        logits = scoring_model(**tokenized).logits[:, :-1]
        likelihood = get_likelihood(logits, labels)
        logrank = get_logrank(logits, labels)
        return - likelihood / logrank

# Normalized Log-Rank Perturbation
def get_npr(args, scoring_model, scoring_tokenizer, text, perturbs):
    with torch.no_grad():
        tokenized = scoring_tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        logits = scoring_model(**tokenized).logits[:, :-1]
        logrank = get_logrank(logits, labels)
        # perturbations
        logranks = []
        for perturb in perturbs:
            tokenized = scoring_tokenizer(perturb, return_tensors="pt", return_token_type_ids=False).to(args.device)
            labels = tokenized.input_ids[:, 1:]
            logits = scoring_model(**tokenized).logits[:, :-1]
            logranks.append(get_logrank(logits, labels))
        # npr
        return np.mean(logranks) / logrank

def experiment(args):
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
    scoring_model.eval()
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data)
    # eval criterions
    criterion_fns = {'lrr': get_lrr, 'npr': get_npr}
    for name in criterion_fns:
        criterion_fn = criterion_fns[name]
        torch.manual_seed(args.seed)
        np.random.seed(args.seed)
        eval_results = []
        for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
            original_text = data[idx]["original"]
            sampled_text = data[idx]["sampled"]
            perturbed_original = data[idx]["perturbed_original"]
            perturbed_sampled = data[idx]["perturbed_sampled"]
            original_crit = criterion_fn(args, scoring_model, scoring_tokenizer, original_text, perturbed_original)
            sampled_crit = criterion_fn(args, scoring_model, scoring_tokenizer, sampled_text, perturbed_sampled)
            # result
            eval_results.append({"original": original_text,
                                 "original_crit": original_crit,
                                 "sampled": sampled_text,
                                 "sampled_crit": sampled_crit})
|
97 |
+
|
98 |
+
# compute prediction scores for real/sampled passages
|
99 |
+
predictions = {'real': [x["original_crit"] for x in eval_results],
|
100 |
+
'samples': [x["sampled_crit"] for x in eval_results]}
|
101 |
+
fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
|
102 |
+
p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
|
103 |
+
print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
|
104 |
+
# log results
|
105 |
+
results_file = f'{args.output_file}.{name}.json'
|
106 |
+
results = { 'name': f'{name}_threshold',
|
107 |
+
'info': {'n_samples': n_samples},
|
108 |
+
'predictions': predictions,
|
109 |
+
'raw_results': eval_results,
|
110 |
+
'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
|
111 |
+
'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
|
112 |
+
'loss': 1 - pr_auc}
|
113 |
+
with open(results_file, 'w') as fout:
|
114 |
+
json.dump(results, fout)
|
115 |
+
print(f'Results written into {results_file}')
|
116 |
+
|
117 |
+
if __name__ == '__main__':
|
118 |
+
parser = argparse.ArgumentParser()
|
119 |
+
parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
|
120 |
+
parser.add_argument('--dataset', type=str, default="xsum")
|
121 |
+
parser.add_argument('--dataset_file', type=str, default="./exp_test/results/xsum_gpt2.perturbation_10")
|
122 |
+
parser.add_argument('--scoring_model_name', type=str, default="gpt2")
|
123 |
+
parser.add_argument('--seed', type=int, default=0)
|
124 |
+
parser.add_argument('--device', type=str, default="cuda")
|
125 |
+
parser.add_argument('--cache_dir', type=str, default="../cache")
|
126 |
+
args = parser.parse_args()
|
127 |
+
|
128 |
+
experiment(args)
|
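
Note (a reading aid derived from the code above, not part of the upload): with per-token log-probabilities and log-ranks averaged over a passage of t tokens, get_lrr and get_npr score a text roughly as

    \mathrm{LRR}(x) = -\,\frac{\tfrac{1}{t}\sum_i \log p(x_i \mid x_{<i})}{\tfrac{1}{t}\sum_i \log r(x_i \mid x_{<i})},
    \qquad
    \mathrm{NPR}(x) = \frac{\tfrac{1}{K}\sum_{k=1}^{K} \overline{\log r}(\tilde{x}_k)}{\overline{\log r}(x)},

where \tilde{x}_1,\dots,\tilde{x}_K are the perturbed texts passed in via the perturbs argument; larger values indicate machine-generated text.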
detector.py
ADDED
@@ -0,0 +1,11 @@
class Detector:
    def __init__(self):
        # Configure model loading or any required files here
        print("Fast-DetectGPT initialized!")

    def detect(self, text):
        """
        Analyzes the given text and returns a result.
        """
        # A placeholder result is returned instead of running the actual analysis
        return [(text, 0.85)]  # 0.85 is the probability of being AI-generated
dna_gpt.py
ADDED
@@ -0,0 +1,211 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import os.path

import numpy as np
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
import re
import torch
import tqdm
import argparse
import json
from data_builder import load_data, save_data
from metrics import get_roc_metrics, get_precision_recall_metrics
from model import load_tokenizer, load_model, get_model_fullname, from_pretrained
import custom_datasets

class PrefixSampler:
    def __init__(self, args):
        self.args = args
        self.base_tokenizer = load_tokenizer(args.base_model_name, args.dataset, args.cache_dir)
        self.base_model = load_model(args.base_model_name, args.device, args.cache_dir)

    def _sample_from_model(self, texts, min_words=55, truncate_ratio=0.5):
        # encode each text as a list of token ids
        if self.args.dataset == 'pubmed':
            pubmed_sep = ' Answer:'
            texts = [t[:t.index(pubmed_sep) + len(pubmed_sep)] for t in texts]
            all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True).to(self.args.device)
        else:
            texts = [t.split(' ') for t in texts]
            texts = [' '.join(t[: int(len(t) * truncate_ratio)]) for t in texts]
            all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True).to(self.args.device)

        self.base_model.eval()
        decoded = ['' for _ in range(len(texts))]

        # sample from the model until we get a sample with at least min_words words for each example
        # this is an inefficient way to do this (since we regenerate for all inputs if just one is too short), but it works
        tries = 0
        m = 0
        while m < min_words:
            if tries != 0:
                print()
                print(f"min words: {m}, needed {min_words}, regenerating (try {tries})")

            sampling_kwargs = {'temperature': self.args.temperature}
            if self.args.do_top_p:
                sampling_kwargs['top_p'] = self.args.top_p
            elif self.args.do_top_k:
                sampling_kwargs['top_k'] = self.args.top_k
            min_length = 50 if self.args.dataset in ['pubmed'] else 150
            outputs = self.base_model.generate(**all_encoded, min_length=min_length, max_length=200, do_sample=True,
                                               **sampling_kwargs, pad_token_id=self.base_tokenizer.eos_token_id,
                                               eos_token_id=self.base_tokenizer.eos_token_id)
            decoded = self.base_tokenizer.batch_decode(outputs, skip_special_tokens=True)
            m = min(len(x.split()) for x in decoded)
            tries += 1

        return decoded

    def generate_samples(self, raw_data, batch_size):
        # trim to shorter length
        def _trim_to_shorter_length(texta, textb):
            # truncate to shorter of o and s
            shorter_length = min(len(texta.split(' ')), len(textb.split(' ')))
            texta = ' '.join(texta.split(' ')[:shorter_length])
            textb = ' '.join(textb.split(' ')[:shorter_length])
            return texta, textb

        def _truncate_to_substring(text, substring, idx_occurrence):
            # truncate everything after the idx_occurrence occurrence of substring
            assert idx_occurrence > 0, 'idx_occurrence must be > 0'
            idx = -1
            for _ in range(idx_occurrence):
                idx = text.find(substring, idx + 1)
                if idx == -1:
                    return text
            return text[:idx]

        data = {
            "original": [],
            "sampled": [],
        }

        assert len(raw_data) % batch_size == 0
        for batch in range(len(raw_data) // batch_size):
            print('Generating samples for batch', batch, 'of', len(raw_data) // batch_size)
            original_text = raw_data[batch * batch_size:(batch + 1) * batch_size]
            sampled_text = self._sample_from_model(original_text, min_words=30 if self.args.dataset in ['pubmed'] else 55, truncate_ratio=self.args.truncate_ratio)

            for o, s in zip(original_text, sampled_text):
                if self.args.dataset == 'pubmed':
                    s = _truncate_to_substring(s, 'Question:', 2)
                    o = o.replace(custom_datasets.SEPARATOR, ' ')

                o, s = _trim_to_shorter_length(o, s)

                # add to the data
                data["original"].append(o)
                data["sampled"].append(s)

        return data

def get_likelihood(logits, labels, pad_index):
    labels = labels.unsqueeze(-1) if labels.ndim == logits.ndim - 1 else labels
    lprobs = torch.log_softmax(logits, dim=-1)
    log_likelihood = lprobs.gather(dim=-1, index=labels)
    mask = labels != pad_index
    log_likelihood = (log_likelihood * mask).sum(dim=1) / mask.sum(dim=1)
    return log_likelihood.squeeze(-1)

def get_log_prob(sampler, text):
    tokenized = sampler.base_tokenizer(text, return_tensors="pt", padding=True).to(sampler.args.device)
    labels = tokenized.input_ids[:, 1:]
    with torch.no_grad():
        logits_score = sampler.base_model(**tokenized).logits[:, :-1]
        return get_likelihood(logits_score, labels, sampler.base_tokenizer.pad_token_id)

def get_log_probs(sampler, texts):
    batch_size = sampler.args.batch_size
    batch_lprobs = []
    for batch in range(len(texts) // batch_size):
        tokenized = sampler.base_tokenizer(texts[batch * batch_size:(batch + 1) * batch_size], return_tensors="pt", padding=True).to(sampler.args.device)
        labels = tokenized.input_ids[:, 1:]
        with torch.no_grad():
            logits_score = sampler.base_model(**tokenized).logits[:, :-1]
            lprobs = get_likelihood(logits_score, labels, sampler.base_tokenizer.pad_token_id)
            batch_lprobs.append(lprobs)
    return torch.cat(batch_lprobs, dim=0)

def get_regen_samples(sampler, text):
    data = [text] * sampler.args.regen_number
    data = sampler.generate_samples(data, batch_size=sampler.args.batch_size)
    return data['sampled']

def get_dna_gpt(sampler, text):
    lprob = get_log_prob(sampler, text)
    regens = get_regen_samples(sampler, text)
    lprob_regens = get_log_probs(sampler, regens)
    wscore = lprob[0] - lprob_regens.mean()
    return wscore.item()

def experiment(args):
    sampler = PrefixSampler(args)
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # evaluate criterion
    name = "dna_gpt"
    criterion_fn = get_dna_gpt

    torch.manual_seed(args.seed)
    np.random.seed(args.seed)
    results = []
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = data["original"][idx]
        sampled_text = data["sampled"][idx]
        # original text
        original_crit = criterion_fn(sampler, original_text)
        # sampled text
        sampled_crit = criterion_fn(sampler, sampled_text)
        # result
        results.append({"original": original_text,
                        "original_crit": original_crit,
                        "sampled": sampled_text,
                        "sampled_crit": sampled_crit})

    # compute prediction scores for real/sampled passages
    predictions = {'real': [x["original_crit"] for x in results],
                   'samples': [x["sampled_crit"] for x in results]}
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
    # results
    results_file = f'{args.output_file}.{name}.json'
    results = { 'name': f'{name}_threshold',
                'info': {'n_samples': n_samples},
                'predictions': predictions,
                'raw_results': results,
                'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
                'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
                'loss': 1 - pr_auc}
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
    print(f'Results written into {results_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/pubmed_davinci")
    parser.add_argument('--dataset', type=str, default="pubmed")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/pubmed_davinci")
    parser.add_argument('--truncate_ratio', type=float, default=0.5)
    parser.add_argument('--regen_number', type=int, default=10)
    parser.add_argument('--base_model_name', type=str, default="gpt2")
    parser.add_argument('--batch_size', type=int, default=10)
    parser.add_argument('--do_top_k', action='store_true')
    parser.add_argument('--top_k', type=int, default=40)
    parser.add_argument('--do_top_p', action='store_true')
    parser.add_argument('--top_p', type=float, default=0.96)
    parser.add_argument('--temperature', type=float, default=1.0)
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
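
Note (a summary of get_dna_gpt above, not new functionality): the DNA-GPT criterion is a WScore that compares the log-likelihood of the candidate text with the average log-likelihood of K regenerations produced from its truncated prefix,

    \mathrm{WScore}(x) = \log p(x) - \frac{1}{K}\sum_{k=1}^{K} \log p(\tilde{x}_k),

where K is --regen_number and the prefix length is controlled by --truncate_ratio.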
fast_detect_gpt.py
ADDED
@@ -0,0 +1,162 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import random

import numpy as np
import torch
import torch.nn.functional as F
import tqdm
import argparse
import json
from data_builder import load_data
from model import load_tokenizer, load_model
from metrics import get_roc_metrics, get_precision_recall_metrics

def get_samples(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1
    nsamples = 10000
    lprobs = torch.log_softmax(logits, dim=-1)
    distrib = torch.distributions.categorical.Categorical(logits=lprobs)
    samples = distrib.sample([nsamples]).permute([1, 2, 0])
    return samples

def get_likelihood(logits, labels):
    assert logits.shape[0] == 1
    assert labels.shape[0] == 1
    labels = labels.unsqueeze(-1) if labels.ndim == logits.ndim - 1 else labels
    lprobs = torch.log_softmax(logits, dim=-1)
    log_likelihood = lprobs.gather(dim=-1, index=labels)
    return log_likelihood.mean(dim=1)

def get_sampling_discrepancy(logits_ref, logits_score, labels):
    assert logits_ref.shape[0] == 1
    assert logits_score.shape[0] == 1
    assert labels.shape[0] == 1
    if logits_ref.size(-1) != logits_score.size(-1):
        # print(f"WARNING: vocabulary size mismatch {logits_ref.size(-1)} vs {logits_score.size(-1)}.")
        vocab_size = min(logits_ref.size(-1), logits_score.size(-1))
        logits_ref = logits_ref[:, :, :vocab_size]
        logits_score = logits_score[:, :, :vocab_size]

    samples = get_samples(logits_ref, labels)
    log_likelihood_x = get_likelihood(logits_score, labels)
    log_likelihood_x_tilde = get_likelihood(logits_score, samples)
    miu_tilde = log_likelihood_x_tilde.mean(dim=-1)
    sigma_tilde = log_likelihood_x_tilde.std(dim=-1)
    discrepancy = (log_likelihood_x.squeeze(-1) - miu_tilde) / sigma_tilde
    return discrepancy.item()

def get_sampling_discrepancy_analytic(logits_ref, logits_score, labels):
    assert logits_ref.shape[0] == 1
    assert logits_score.shape[0] == 1
    assert labels.shape[0] == 1
    if logits_ref.size(-1) != logits_score.size(-1):
        # print(f"WARNING: vocabulary size mismatch {logits_ref.size(-1)} vs {logits_score.size(-1)}.")
        vocab_size = min(logits_ref.size(-1), logits_score.size(-1))
        logits_ref = logits_ref[:, :, :vocab_size]
        logits_score = logits_score[:, :, :vocab_size]

    labels = labels.unsqueeze(-1) if labels.ndim == logits_score.ndim - 1 else labels
    lprobs_score = torch.log_softmax(logits_score, dim=-1)
    probs_ref = torch.softmax(logits_ref, dim=-1)
    log_likelihood = lprobs_score.gather(dim=-1, index=labels).squeeze(-1)
    mean_ref = (probs_ref * lprobs_score).sum(dim=-1)
    var_ref = (probs_ref * torch.square(lprobs_score)).sum(dim=-1) - torch.square(mean_ref)
    discrepancy = (log_likelihood.sum(dim=-1) - mean_ref.sum(dim=-1)) / var_ref.sum(dim=-1).sqrt()
    discrepancy = discrepancy.mean()
    return discrepancy.item()

def experiment(args):
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
    scoring_model.eval()
    if args.reference_model_name != args.scoring_model_name:
        reference_tokenizer = load_tokenizer(args.reference_model_name, args.dataset, args.cache_dir)
        reference_model = load_model(args.reference_model_name, args.device, args.cache_dir)
        reference_model.eval()
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # evaluate criterion
    if args.discrepancy_analytic:
        name = "sampling_discrepancy_analytic"
        criterion_fn = get_sampling_discrepancy_analytic
    else:
        name = "sampling_discrepancy"
        criterion_fn = get_sampling_discrepancy

    random.seed(args.seed)
    torch.manual_seed(args.seed)
    np.random.seed(args.seed)
    results = []
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = data["original"][idx]
        sampled_text = data["sampled"][idx]
        # original text
        tokenized = scoring_tokenizer(original_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        with torch.no_grad():
            logits_score = scoring_model(**tokenized).logits[:, :-1]
            if args.reference_model_name == args.scoring_model_name:
                logits_ref = logits_score
            else:
                tokenized = reference_tokenizer(original_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
                assert torch.all(tokenized.input_ids[:, 1:] == labels), "Tokenizer mismatch."
                logits_ref = reference_model(**tokenized).logits[:, :-1]
            original_crit = criterion_fn(logits_ref, logits_score, labels)
        # sampled text
        tokenized = scoring_tokenizer(sampled_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        with torch.no_grad():
            logits_score = scoring_model(**tokenized).logits[:, :-1]
            if args.reference_model_name == args.scoring_model_name:
                logits_ref = logits_score
            else:
                tokenized = reference_tokenizer(sampled_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
                assert torch.all(tokenized.input_ids[:, 1:] == labels), "Tokenizer mismatch."
                logits_ref = reference_model(**tokenized).logits[:, :-1]
            sampled_crit = criterion_fn(logits_ref, logits_score, labels)
        # result
        results.append({"original": original_text,
                        "original_crit": original_crit,
                        "sampled": sampled_text,
                        "sampled_crit": sampled_crit})

    # compute prediction scores for real/sampled passages
    predictions = {'real': [x["original_crit"] for x in results],
                   'samples': [x["sampled_crit"] for x in results]}
    print(f"Real mean/std: {np.mean(predictions['real']):.2f}/{np.std(predictions['real']):.2f}, Samples mean/std: {np.mean(predictions['samples']):.2f}/{np.std(predictions['samples']):.2f}")
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
    # results
    results_file = f'{args.output_file}.{name}.json'
    results = { 'name': f'{name}_threshold',
                'info': {'n_samples': n_samples},
                'predictions': predictions,
                'raw_results': results,
                'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
                'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
                'loss': 1 - pr_auc}
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
    print(f'Results written into {results_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--reference_model_name', type=str, default="gpt2")
    parser.add_argument('--scoring_model_name', type=str, default="gpt2")
    parser.add_argument('--discrepancy_analytic', action='store_true')
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
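
Note (a reading aid for get_sampling_discrepancy_analytic above): instead of drawing 10,000 samples as get_sampling_discrepancy does, it evaluates the conditional probability curvature in closed form from the mean and variance of the scoring log-probability under the reference distribution,

    d(x) = \frac{\sum_j \log p_{\mathrm{score}}(x_j \mid x_{<j}) - \sum_j \mathbb{E}_{\tilde{x}_j \sim p_{\mathrm{ref}}}\big[\log p_{\mathrm{score}}(\tilde{x}_j \mid x_{<j})\big]}
               {\sqrt{\sum_j \mathrm{Var}_{\tilde{x}_j \sim p_{\mathrm{ref}}}\big[\log p_{\mathrm{score}}(\tilde{x}_j \mid x_{<j})\big]}},

which is why the analytic variant needs only a single forward pass per model.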
gpt3to4.sh
ADDED
@@ -0,0 +1,116 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_gpt3to4
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum writing pubmed"
source_models="davinci gpt-3.5-turbo gpt-4"

# preparing dataset
openai_base="https://api.openai.com/v1"
openai_key="xxxxxxxx"  # replace with your own key for generating your own test set

# We follow DetectGPT settings for generating text from GPT-3
M=davinci
for D in $datasets; do
  echo `date`, Preparing dataset ${D} by sampling from openai/${M} ...
  python scripts/data_builder.py --openai_model $M --openai_key $openai_key --openai_base $openai_base \
      --dataset $D --n_samples 150 --do_top_p --top_p 0.9 --batch_size 1 \
      --output_file $data_path/${D}_${M}
done

# We use a temperature of 0.8 for creative writing
for M in gpt-3.5-turbo gpt-4; do
  for D in $datasets; do
    echo `date`, Preparing dataset ${D} by sampling from openai/${M} ...
    python scripts/data_builder.py --openai_model $M --openai_key $openai_key --openai_base $openai_base \
        --dataset $D --n_samples 150 --do_temperature --temperature 0.8 --batch_size 1 \
        --output_file $data_path/${D}_${M}
  done
done

# evaluate Fast-DetectGPT in the black-box setting
settings="gpt-j-6B:gpt2-xl gpt-j-6B:gpt-neo-2.7B gpt-j-6B:gpt-j-6B"
for M in $source_models; do
  for D in $datasets; do
    for S in $settings; do
      IFS=':' read -r -a S <<< $S && M1=${S[0]} && M2=${S[1]}
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name $M1 --scoring_model_name $M2 --discrepancy_analytic \
          --dataset $D --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate supervised detectors
supervised_models="roberta-base-openai-detector roberta-large-openai-detector"
for M in $source_models; do
  for D in $datasets; do
    for SM in $supervised_models; do
      echo `date`, Evaluating ${SM} on ${D}_${M} ...
      python scripts/supervised.py --model_name $SM --dataset $D \
          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    done
  done
done

# evaluate baselines
scoring_models="gpt-neo-2.7B"
for M in $source_models; do
  for D in $datasets; do
    for M2 in $scoring_models; do
      echo `date`, Evaluating baseline methods on ${D}_${M}.${M2} ...
      python scripts/baselines.py --scoring_model_name ${M2} --dataset $D \
          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M2}
    done
  done
done

# evaluate DNA-GPT
scoring_models="gpt-neo-2.7B"
for M in $source_models; do
  for D in $datasets; do
    for M2 in $scoring_models; do
      echo `date`, Evaluating DNA-GPT on ${D}_${M}.${M2} ...
      python scripts/dna_gpt.py --base_model_name ${M2} --dataset $D \
          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M2}
    done
  done
done

# evaluate DetectGPT and DetectLLM
scoring_models="gpt2-xl gpt-neo-2.7B gpt-j-6B"
for M in $source_models; do
  for D in $datasets; do
    M1=t5-11b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
          --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate GPTZero
for M in $source_models; do
  for D in $datasets; do
    echo `date`, Evaluating GPTZero on ${D}_${M} ...
    python scripts/gptzero.py --dataset $D \
        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done
gptzero.py
ADDED
@@ -0,0 +1,84 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import time

import numpy as np
import tqdm
import argparse
import json
from metrics import get_roc_metrics, get_precision_recall_metrics
from data_builder import load_data

def detect_gptzero(args, text):
    import requests
    url = "https://api.gptzero.me/v2/predict/text"
    payload = {
        "document": text,
        "version": "2023-09-14"
    }
    headers = {
        "Accept": "application/json",
        "content-type": "application/json",
        "x-api-key": ""
    }

    while True:
        try:
            time.sleep(600)  # 1 request per 10 minutes for free access
            response = requests.post(url, json=payload, headers=headers)
            return response.json()['documents'][0]['completely_generated_prob']
        except Exception as ex:
            print(ex)

def experiment(args):
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # evaluate criterion
    name = "gptzero"
    criterion_fn = detect_gptzero

    results = []
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = data["original"][idx]
        sampled_text = data["sampled"][idx]
        original_crit = criterion_fn(args, original_text)
        sampled_crit = criterion_fn(args, sampled_text)
        # result
        results.append({"original": original_text,
                        "original_crit": original_crit,
                        "sampled": sampled_text,
                        "sampled_crit": sampled_crit})

    # compute prediction scores for real/sampled passages
    predictions = {'real': [x["original_crit"] for x in results],
                   'samples': [x["sampled_crit"] for x in results]}
    print(f"Real mean/std: {np.mean(predictions['real']):.2f}/{np.std(predictions['real']):.2f}, Samples mean/std: {np.mean(predictions['samples']):.2f}/{np.std(predictions['samples']):.2f}")
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")

    # results
    results_file = f'{args.output_file}.{name}.json'
    results = { 'name': f'{name}_threshold',
                'info': {'n_samples': n_samples},
                'predictions': predictions,
                'raw_results': results,
                'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
                'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
                'loss': 1 - pr_auc}
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
    print(f'Results written into {results_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_gpt3to4/results/xsum_gpt-4")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_gpt3to4/data/xsum_gpt-4")
    args = parser.parse_args()

    experiment(args)
index.html
ADDED
@@ -0,0 +1,106 @@
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <meta name="viewport" content="width=device-width, initial-scale=1.0">
    <title>Fast-DetectGPT</title>
    <style>
        body {
            font-family: Arial, sans-serif;
            margin: 20px;
            background-color: #f9f9f9;
        }
        .container {
            max-width: 700px;
            margin: auto;
            background: #ffffff;
            border-radius: 8px;
            padding: 20px;
            box-shadow: 0 4px 8px rgba(0, 0, 0, 0.2);
        }
        h1 {
            text-align: center;
            color: #333;
        }
        textarea {
            width: 100%;
            height: 150px;
            margin: 15px 0;
            padding: 10px;
            border: 1px solid #ccc;
            border-radius: 5px;
            font-size: 16px;
        }
        button {
            display: block;
            width: 100%;
            padding: 10px;
            background-color: #007bff;
            color: white;
            border: none;
            border-radius: 5px;
            font-size: 16px;
            cursor: pointer;
        }
        button:hover {
            background-color: #0056b3;
        }
        #result {
            margin-top: 20px;
            padding: 15px;
            background-color: #f1f1f1;
            border: 1px solid #ddd;
            border-radius: 5px;
        }
        .error {
            color: red;
        }
    </style>
</head>
<body>
    <div class="container">
        <h1>Fast-DetectGPT</h1>
        <form id="analyzeForm">
            <textarea name="text" placeholder="Enter your text here..." required></textarea>
            <button type="submit">Analyze</button>
        </form>
        <div id="result"></div>
    </div>

    <script>
        document.getElementById('analyzeForm').addEventListener('submit', function (e) {
            e.preventDefault(); // prevent the form's default submit behavior
            const formData = new FormData(this);
            const resultDiv = document.getElementById('result');

            // clear the previous result first
            resultDiv.textContent = '';

            // send the POST request
            fetch('/analyze', {
                method: 'POST',
                headers: {
                    'Content-Type': 'application/json',
                },
                body: JSON.stringify({
                    text: formData.get('text'),
                }),
            })
            .then(response => response.json())
            .then(data => {
                if (data.error) {
                    resultDiv.innerHTML = `<p class="error">Error: ${data.error}</p>`;
                } else {
                    resultDiv.innerHTML = `
                        <p><strong>Criterion:</strong> ${data.criterion}</p>
                        <p><strong>Probability of being machine-generated:</strong> ${data.probability_machine_generated}</p>
                    `;
                }
            })
            .catch(err => {
                resultDiv.innerHTML = `<p class="error">An error occurred: ${err.message}</p>`;
            });
        });
    </script>
</body>
</html>
local_infer.py
ADDED
@@ -0,0 +1,94 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import random

import numpy as np
import torch
import os
import glob
import argparse
import json
from scripts.model import load_tokenizer, load_model
from scripts.fast_detect_gpt import get_sampling_discrepancy_analytic


# estimate the probability according to the distribution of our test results on ChatGPT and GPT-4
class ProbEstimator:
    def __init__(self, args):
        self.real_crits = []
        self.fake_crits = []
        for result_file in glob.glob(os.path.join(args.ref_path, '*.json')):
            with open(result_file, 'r') as fin:
                res = json.load(fin)
                self.real_crits.extend(res['predictions']['real'])
                self.fake_crits.extend(res['predictions']['samples'])
        print(f'ProbEstimator: total {len(self.real_crits) * 2} samples.')

    def crit_to_prob(self, crit):
        offset = np.sort(np.abs(np.array(self.real_crits + self.fake_crits) - crit))[100]
        cnt_real = np.sum((np.array(self.real_crits) > crit - offset) & (np.array(self.real_crits) < crit + offset))
        cnt_fake = np.sum((np.array(self.fake_crits) > crit - offset) & (np.array(self.fake_crits) < crit + offset))
        return cnt_fake / (cnt_real + cnt_fake)

# run interactive local inference
def run(args):
    # load model
    scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
    scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
    scoring_model.eval()
    if args.reference_model_name != args.scoring_model_name:
        reference_tokenizer = load_tokenizer(args.reference_model_name, args.dataset, args.cache_dir)
        reference_model = load_model(args.reference_model_name, args.device, args.cache_dir)
        reference_model.eval()
    # evaluate criterion
    name = "sampling_discrepancy_analytic"
    criterion_fn = get_sampling_discrepancy_analytic
    prob_estimator = ProbEstimator(args)
    # input text
    print('Local demo for Fast-DetectGPT: longer texts give more reliable results.')
    print('')
    while True:
        print("Please enter your text: (Press Enter twice to start processing)")
        lines = []
        while True:
            line = input()
            if len(line) == 0:
                break
            lines.append(line)
        text = "\n".join(lines)
        if len(text) == 0:
            break
        # evaluate text
        tokenized = scoring_tokenizer(text, truncation=True, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
        labels = tokenized.input_ids[:, 1:]
        with torch.no_grad():
            logits_score = scoring_model(**tokenized).logits[:, :-1]
            if args.reference_model_name == args.scoring_model_name:
                logits_ref = logits_score
            else:
                tokenized = reference_tokenizer(text, truncation=True, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
                assert torch.all(tokenized.input_ids[:, 1:] == labels), "Tokenizer mismatch."
                logits_ref = reference_model(**tokenized).logits[:, :-1]
            crit = criterion_fn(logits_ref, logits_score, labels)
        # estimate the probability of machine generated text
        prob = prob_estimator.crit_to_prob(crit)
        print(f'Fast-DetectGPT criterion is {crit:.4f}, suggesting that the text has a probability of {prob * 100:.0f}% to be machine-generated.')
        print()

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--reference_model_name', type=str, default="gpt-neo-2.7B")  # use gpt-j-6B for more accurate detection
    parser.add_argument('--scoring_model_name', type=str, default="gpt-neo-2.7B")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--ref_path', type=str, default="./local_infer_ref")
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    run(args)
main.sh
ADDED
@@ -0,0 +1,97 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_main
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"

# preparing dataset
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Preparing dataset ${D}_${M} ...
    python scripts/data_builder.py --dataset $D --n_samples 500 --base_model_name $M --output_file $data_path/${D}_${M}
  done
done

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
    python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

    echo `date`, Evaluating baseline methods on ${D}_${M} ...
    python scripts/baselines.py --scoring_model_name $M --dataset $D \
        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done

# evaluate DNA-GPT
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating DNA-GPT on ${D}_${M} ...
    python scripts/dna_gpt.py --base_model_name $M --dataset $D \
        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating DetectGPT on ${D}_${M} ...
    python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    # we leverage DetectGPT to generate the perturbations
    echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
    python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
        --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
  done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
  for M in $source_models; do
    M1=gpt-j-6B  # sampling model
    for M2 in $scoring_models; do
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    M1=t5-3b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
          --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done
main_ext.sh
ADDED
@@ -0,0 +1,89 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_main_ext
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="bloom-7b1 opt-13b llama-13b llama2-13b"

# preparing dataset
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Preparing dataset ${D}_${M} ...
    python scripts/data_builder.py --dataset $D --n_samples 500 --base_model_name $M --output_file $data_path/${D}_${M}
  done
done
exit

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
    python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

    echo `date`, Evaluating baseline methods on ${D}_${M} ...
    python scripts/baselines.py --scoring_model_name $M --dataset $D \
        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating DetectGPT on ${D}_${M} ...
    python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
        --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    # we leverage DetectGPT to generate the perturbations
    echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
    python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
        --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
  done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
  for M in $source_models; do
    M1=gpt-j-6B  # sampling model
    for M2 in $scoring_models; do
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    M1=t5-3b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
          --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done
metrics.py
ADDED
@@ -0,0 +1,26 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import matplotlib.pyplot as plt
from sklearn.metrics import roc_curve, precision_recall_curve, auc

# 15 colorblind-friendly colors
COLORS = ["#0072B2", "#009E73", "#D55E00", "#CC79A7", "#F0E442",
          "#56B4E9", "#E69F00", "#000000", "#0072B2", "#009E73",
          "#D55E00", "#CC79A7", "#F0E442", "#56B4E9", "#E69F00"]


def get_roc_metrics(real_preds, sample_preds):
    fpr, tpr, _ = roc_curve([0] * len(real_preds) + [1] * len(sample_preds), real_preds + sample_preds)
    roc_auc = auc(fpr, tpr)
    return fpr.tolist(), tpr.tolist(), float(roc_auc)


def get_precision_recall_metrics(real_preds, sample_preds):
    precision, recall, _ = precision_recall_curve([0] * len(real_preds) + [1] * len(sample_preds),
                                                  real_preds + sample_preds)
    pr_auc = auc(recall, precision)
    return precision.tolist(), recall.tolist(), float(pr_auc)
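
A minimal usage sketch for the two helpers above; the score lists are made up purely for illustration:

    from metrics import get_roc_metrics, get_precision_recall_metrics

    real_scores = [0.1, 0.3, 0.2, 0.4]    # criterion values on human-written passages (illustrative)
    fake_scores = [0.8, 0.9, 0.7, 0.6]    # criterion values on machine-generated passages (illustrative)
    fpr, tpr, roc_auc = get_roc_metrics(real_scores, fake_scores)
    p, r, pr_auc = get_precision_recall_metrics(real_scores, fake_scores)
    print(roc_auc, pr_auc)  # detection quality as ROC AUC and PR AUC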
model.py
ADDED
@@ -0,0 +1,79 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
import time
import os

def from_pretrained(cls, model_name, kwargs, cache_dir):
    # use local model if it exists
    local_path = os.path.join(cache_dir, 'local.' + model_name.replace("/", "_"))
    if os.path.exists(local_path):
        return cls.from_pretrained(local_path, **kwargs)
    return cls.from_pretrained(model_name, **kwargs, cache_dir=cache_dir)

# predefined models
model_fullnames = {'gpt2': 'gpt2',
                   'gpt2-xl': 'gpt2-xl',
                   'opt-2.7b': 'facebook/opt-2.7b',
                   'gpt-neo-2.7B': 'EleutherAI/gpt-neo-2.7B',
                   'gpt-j-6B': 'EleutherAI/gpt-j-6B',
                   'gpt-neox-20b': 'EleutherAI/gpt-neox-20b',
                   'mgpt': 'sberbank-ai/mGPT',
                   'pubmedgpt': 'stanford-crfm/pubmedgpt',
                   'mt5-xl': 'google/mt5-xl',
                   'llama-13b': 'huggyllama/llama-13b',
                   'llama2-13b': 'TheBloke/Llama-2-13B-fp16',
                   'bloom-7b1': 'bigscience/bloom-7b1',
                   'opt-13b': 'facebook/opt-13b',
                   }
float16_models = ['gpt-j-6B', 'gpt-neox-20b', 'llama-13b', 'llama2-13b', 'bloom-7b1', 'opt-13b']

def get_model_fullname(model_name):
    return model_fullnames[model_name] if model_name in model_fullnames else model_name

def load_model(model_name, device, cache_dir):
    model_fullname = get_model_fullname(model_name)
    print(f'Loading model {model_fullname}...')
    model_kwargs = {}
    if model_name in float16_models:
        model_kwargs.update(dict(torch_dtype=torch.float16))
    if 'gpt-j' in model_name:
        model_kwargs.update(dict(revision='float16'))
    model = from_pretrained(AutoModelForCausalLM, model_fullname, model_kwargs, cache_dir)
    print('Moving model to GPU...', end='', flush=True)
    start = time.time()
    model.to(device)
    print(f'DONE ({time.time() - start:.2f}s)')
    return model

def load_tokenizer(model_name, for_dataset, cache_dir):
    model_fullname = get_model_fullname(model_name)
    optional_tok_kwargs = {}
    if "facebook/opt-" in model_fullname:
        print("Using non-fast tokenizer for OPT")
        optional_tok_kwargs['fast'] = False
    if for_dataset in ['pubmed']:
        optional_tok_kwargs['padding_side'] = 'left'
    else:
        optional_tok_kwargs['padding_side'] = 'right'
    base_tokenizer = from_pretrained(AutoTokenizer, model_fullname, optional_tok_kwargs, cache_dir=cache_dir)
    if base_tokenizer.pad_token_id is None:
        base_tokenizer.pad_token_id = base_tokenizer.eos_token_id
        if '13b' in model_fullname:
            base_tokenizer.pad_token_id = 0
    return base_tokenizer


if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('--model_name', type=str, default="bloom-7b1")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    load_tokenizer(args.model_name, 'xsum', args.cache_dir)
    load_model(args.model_name, 'cpu', args.cache_dir)
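A minimal sketch of loading a scoring model and its tokenizer through the helpers above, assuming a CUDA device is available and ../cache is writable; the model name is one of the predefined keys:

from model import load_model, load_tokenizer

# downloads into ../cache on first use; later runs reuse the cached weights
model = load_model('gpt-neo-2.7B', 'cuda', '../cache')
tokenizer = load_tokenizer('gpt-neo-2.7B', 'xsum', '../cache')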
paraphrasing.py
ADDED
@@ -0,0 +1,106 @@
import random

import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import numpy as np
import nltk
from tqdm import tqdm
from data_builder import load_data, save_data
from model import from_pretrained

class T5Paraphraser:
    def __init__(self, args):
        self.device = args.device
        self.tokenizer = from_pretrained(AutoTokenizer, args.t5_model_name, {}, args.cache_dir)
        self.model = from_pretrained(AutoModelForSeq2SeqLM, args.t5_model_name, {}, args.cache_dir)
        self.model = self.model.to(args.device)
        self.model.eval()

    def paraphrase(self, sents):
        parabatch = ["paraphrase: " + sent + " </s>" for sent in sents]
        encoding = self.tokenizer(parabatch, padding=True, return_tensors="pt")
        input_ids, attention_masks = encoding["input_ids"].to(self.device), encoding["attention_mask"].to(self.device)
        outputs = self.model.generate(
            input_ids=input_ids, attention_mask=attention_masks,
            max_length=256,
            do_sample=True,
            top_k=200,
            top_p=0.95,
            early_stopping=True,
            num_return_sequences=1
        )
        assert len(sents) == len(outputs)
        results = []
        for output, sent in zip(outputs, sents):
            line = self.tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
            line = line.strip()
            # fall back to the original sentence if the model returns an empty paraphrase
            line = line if len(line) > 0 else sent
            results.append(line)
        return results

class RandomParaphraser:
    def __init__(self, args):
        self.device = args.device

    def paraphrase(self, sents):
        results = []
        for sent in sents:
            words = sent.split()
            if len(words) > 20:
                # swap a random pair of adjacent words
                idx = random.randint(0, len(words) - 2)
                words[idx], words[idx+1] = words[idx+1], words[idx]
            results.append(' '.join(words))
        return results

def generate_data(args):
    data = load_data(args.dataset_file)
    originals = data['original']
    samples = data['sampled']
    print(f"Total number of samples: {len(samples)}")
    print(f"Average number of words: {np.mean([len(x.split()) for x in samples])}")

    if args.paraphraser == 'random':
        print(f'Using random paraphraser.')
        paraphraser = RandomParaphraser(args)
    else:
        print(f'Loading model {args.t5_model_name}...')
        paraphraser = T5Paraphraser(args)

    new_samples = []
    for sample in tqdm(samples):
        lines = sample.split('\n')
        new_lines = []
        for line in lines:
            line = line.strip()
            if len(line) == 0:
                new_lines.append(line)
            else:
                sents = nltk.sent_tokenize(line)
                new_sents = paraphraser.paraphrase(sents)
                new_lines.append(' '.join(new_sents))
        new_samples.append('\n'.join(new_lines))

    new_data = {'original': originals, 'sampled': new_samples}
    save_data(args.output_file, args, new_data)


if __name__ == '__main__':
    import argparse
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--t5_model_name', type=str, default="Vamsi/T5_Paraphrase_Paws")
    parser.add_argument('--paraphraser', type=str, default="t5", choices=["t5", "random"])
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    torch.manual_seed(args.seed)
    np.random.seed(args.seed)

    nltk.download('punkt')

    generate_data(args)
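A minimal sketch of the RandomParaphraser above in isolation (no model download needed); the example sentence is made up:

import argparse
from paraphrasing import RandomParaphraser

args = argparse.Namespace(device='cpu')
para = RandomParaphraser(args)
sents = ["This sentence has more than twenty words so the paraphraser will swap one random adjacent pair of words before returning it to the caller."]
print(para.paraphrase(sents))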
report_results.py
ADDED
@@ -0,0 +1,490 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.
import os.path
import argparse
import json
import numpy as np


def save_lines(lines, file):
    with open(file, 'w') as fout:
        fout.write('\n'.join(lines))

def get_auroc(result_file):
    with open(result_file, 'r') as fin:
        res = json.load(fin)
    return res['metrics']['roc_auc']

def get_fpr_tpr(result_file):
    with open(result_file, 'r') as fin:
        res = json.load(fin)
    return res['metrics']['fpr'], res['metrics']['tpr']

def report_main_results(args):
    datasets = {'xsum': 'XSum',
                'squad': 'SQuAD',
                'writing': 'WritingPrompts'}
    source_models = {'gpt2-xl': 'GPT-2',
                     'opt-2.7b': 'OPT-2.7',
                     'gpt-neo-2.7B': 'Neo-2.7',
                     'gpt-j-6B': 'GPT-J',
                     'gpt-neox-20b': 'NeoX'}
    methods1 = {'likelihood': 'Likelihood',
                'entropy': 'Entropy',
                'logrank': 'LogRank',
                'lrr': 'LRR',
                'npr': 'NPR'}
    methods2 = {'perturbation_100': 'DetectGPT',
                'sampling_discrepancy': 'Fast-DetectGPT'}

    def _get_method_aurocs(dataset, method, filter=''):
        cols = []
        for model in source_models:
            result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
            if os.path.exists(result_file):
                auroc = get_auroc(result_file)
            else:
                auroc = 0.0
            cols.append(auroc)
        cols.append(np.mean(cols))
        return cols

    headers = ['Method'] + [source_models[model] for model in source_models] + ['Avg.']
    for dataset in datasets:
        print('----')
        print(datasets[dataset])
        print('----')
        print(' '.join(headers))
        # basic methods
        for method in methods1:
            method_name = methods1[method]
            cols = _get_method_aurocs(dataset, method)
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))
        # white-box comparison
        results = {}
        for method in methods2:
            method_name = methods2[method]
            cols = _get_method_aurocs(dataset, method)
            results[method_name] = cols
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))
        cols = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
        cols = [f'{col:.4f}' for col in cols]
        print('(Diff)', ' '.join(cols))
        # black-box comparison
        filters = {'perturbation_100': '.t5-3b_gpt-neo-2.7B',
                   'sampling_discrepancy': '.gpt-j-6B_gpt-neo-2.7B'}
        results = {}
        for method in methods2:
            method_name = methods2[method]
            cols = _get_method_aurocs(dataset, method, filters[method])
            results[method_name] = cols
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))
        cols = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
        cols = [f'{col:.4f}' for col in cols]
        print('(Diff)', ' '.join(cols))

def report_main_ext_results(args):
    datasets = {'xsum': 'XSum',
                'squad': 'SQuAD',
                'writing': 'WritingPrompts'}
    source_models = {'bloom-7b1': 'BLOOM-7.1',
                     'opt-13b': 'OPT-13',
                     'llama-13b': 'Llama-13',
                     'llama2-13b': 'Llama2-13',
                     }
    methods1 = {'likelihood': 'Likelihood',
                'entropy': 'Entropy',
                'logrank': 'LogRank',
                'lrr': 'LRR',
                'npr': 'NPR'}
    methods2 = {'perturbation_100': 'DetectGPT',
                'sampling_discrepancy': 'Fast-DetectGPT'}

    def _get_method_aurocs(dataset, method, filter=''):
        cols = []
        for model in source_models:
            result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
            if os.path.exists(result_file):
                auroc = get_auroc(result_file)
            else:
                auroc = 0.0
            cols.append(auroc)
        cols.append(np.mean(cols))
        return cols

    headers = ['Method'] + [source_models[model] for model in source_models] + ['Avg.']
    for dataset in datasets:
        print('----')
        print(datasets[dataset])
        print('----')
        print(' '.join(headers))
        # basic methods
        for method in methods1:
            method_name = methods1[method]
            cols = _get_method_aurocs(dataset, method)
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))
        # white-box comparison
        results = {}
        for method in methods2:
            method_name = methods2[method]
            cols = _get_method_aurocs(dataset, method)
            results[method_name] = cols
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))
        cols = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
        cols = [f'{col:.4f}' for col in cols]
        print('(Diff)', ' '.join(cols))
        # black-box comparison
        filters = {'perturbation_100': '.t5-3b_gpt-neo-2.7B',
                   'sampling_discrepancy': '.gpt-j-6B_gpt-neo-2.7B'}
        results = {}
        for method in methods2:
            method_name = methods2[method]
            cols = _get_method_aurocs(dataset, method, filters[method])
            results[method_name] = cols
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))
        cols = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
        cols = [f'{col:.4f}' for col in cols]
        print('(Diff)', ' '.join(cols))

def report_refmodel_results(args):
    datasets = {'xsum': 'XSum',
                'squad': 'SQuAD',
                'writing': 'WritingPrompts'}
    source_models = {'gpt2-xl': 'GPT-2',
                     'gpt-neo-2.7B': 'Neo-2.7',
                     'gpt-j-6B': 'GPT-J'}

    def _get_method_aurocs(method, ref_model=None):
        cols = []
        for dataset in datasets:
            for model in source_models:
                filter = '' if ref_model is None or ref_model == model else f'.{ref_model}_{model}'
                result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
                if os.path.exists(result_file):
                    auroc = get_auroc(result_file)
                else:
                    auroc = 0.0
                cols.append(auroc)
        cols.append(np.mean(cols))
        return cols

    headers1 = ['----'] + list([datasets[d] for d in datasets])
    headers2 = ['Method'] + [source_models[model] for model in source_models] \
               + [source_models[model] for model in source_models] \
               + [source_models[model] for model in source_models] \
               + ['Avg.']
    print(' '.join(headers1))
    print(' '.join(headers2))

    ref_models = [None, 'gpt2-xl', 'gpt-neo-2.7B', 'gpt-j-6B']
    for ref_model in ref_models:
        method = 'sampling_discrepancy'
        method_name = 'Fast-DetectGPT (*/*)' if ref_model is None else f'Fast-DetectGPT ({source_models[ref_model]}/*)'
        cols = _get_method_aurocs(method, ref_model)
        cols = [f'{col:.4f}' for col in cols]
        print(method_name, ' '.join(cols))


def report_chatgpt_gpt4_results(args):
    datasets = {'xsum': 'XSum',
                'writing': 'Writing',
                'pubmed': 'PubMed'}
    source_models = {'gpt-3.5-turbo': 'ChatGPT',
                     'gpt-4': 'GPT-4'}
    score_models = {'t5-11b': 'T5-11B',
                    'gpt2-xl': 'GPT-2',
                    'opt-2.7b': 'OPT-2.7',
                    'gpt-neo-2.7B': 'Neo-2.7',
                    'gpt-j-6B': 'GPT-J',
                    'gpt-neox-20b': 'NeoX'}
    methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
                'roberta-large-openai-detector': 'RoBERTa-large'}
    methods2 = {'likelihood': 'Likelihood', 'entropy': 'Entropy', 'logrank': 'LogRank'}
    methods3 = {'lrr': 'LRR', 'npr': 'NPR', 'perturbation_100': 'DetectGPT',
                'sampling_discrepancy_analytic': 'Fast'}

    def _get_method_aurocs(method, filter=''):
        results = []
        for model in source_models:
            cols = []
            for dataset in datasets:
                result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
                if os.path.exists(result_file):
                    auroc = get_auroc(result_file)
                else:
                    auroc = 0.0
                cols.append(auroc)
            cols.append(np.mean(cols))
            results.extend(cols)
        return results

    headers1 = ['--'] + [source_models[model] for model in source_models]
    headers2 = ['Method'] + [datasets[dataset] for dataset in datasets] + ['Avg.'] \
               + [datasets[dataset] for dataset in datasets] + ['Avg.']
    print(' '.join(headers1))
    print(' '.join(headers2))
    # supervised methods
    for method in methods1:
        method_name = methods1[method]
        cols = _get_method_aurocs(method)
        cols = [f'{col:.4f}' for col in cols]
        print(method_name, ' '.join(cols))
    # zero-shot methods
    filters2 = {'likelihood': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
                'entropy': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
                'logrank': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b']}
    filters3 = {'lrr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'npr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'perturbation_100': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'sampling_discrepancy_analytic': ['.gpt-j-6B_gpt2-xl', '.gpt-j-6B_gpt-neo-2.7B', '.gpt-j-6B_gpt-j-6B', '.gpt-neox-20b_gpt-neox-20b']}
    for method in methods2:
        for filter in filters2[method]:
            setting = score_models[filter[1:]]
            method_name = f'{methods2[method]}({setting})'
            cols = _get_method_aurocs(method, filter)
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))
    for method in methods3:
        for filter in filters3[method]:
            setting = [score_models[model] for model in filter[1:].split('_')]
            method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
            cols = _get_method_aurocs(method, filter)
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))

def report_gpt3_results(args):
    datasets = {'xsum': 'XSum',
                'writing': 'Writing',
                'pubmed': 'PubMed'}
    source_models = {'davinci': 'GPT-3'}
    score_models = {'t5-11b': 'T5-11B',
                    'gpt2-xl': 'GPT-2',
                    'opt-2.7b': 'OPT-2.7',
                    'gpt-neo-2.7B': 'Neo-2.7',
                    'gpt-j-6B': 'GPT-J',
                    'gpt-neox-20b': 'NeoX'}
    methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
                'roberta-large-openai-detector': 'RoBERTa-large'}
    methods2 = {'likelihood': 'Likelihood', 'entropy': 'Entropy', 'logrank': 'LogRank'}
    methods3 = {'lrr': 'LRR', 'npr': 'NPR', 'perturbation_100': 'DetectGPT',
                'sampling_discrepancy_analytic': 'Fast'}

    def _get_method_aurocs(method, filter=''):
        results = []
        for model in source_models:
            cols = []
            for dataset in datasets:
                result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
                if os.path.exists(result_file):
                    auroc = get_auroc(result_file)
                else:
                    auroc = 0.0
                cols.append(auroc)
            cols.append(np.mean(cols))
            results.extend(cols)
        return results

    headers1 = ['--'] + [source_models[model] for model in source_models]
    headers2 = ['Method'] + [datasets[dataset] for dataset in datasets] + ['Avg.'] \
               + [datasets[dataset] for dataset in datasets] + ['Avg.']
    print(' '.join(headers1))
    print(' '.join(headers2))
    # supervised methods
    for method in methods1:
        method_name = methods1[method]
        cols = _get_method_aurocs(method)
        cols = [f'{col:.4f}' for col in cols]
        print(method_name, ' '.join(cols))
    # zero-shot methods
    filters2 = {'likelihood': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
                'entropy': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
                'logrank': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b']}
    filters3 = {'lrr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'npr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'perturbation_100': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
                'sampling_discrepancy_analytic': ['.gpt-j-6B_gpt2-xl', '.gpt-j-6B_gpt-neo-2.7B', '.gpt-j-6B_gpt-j-6B', '.gpt-neox-20b_gpt-neox-20b']}
    for method in methods2:
        for filter in filters2[method]:
            setting = score_models[filter[1:]]
            method_name = f'{methods2[method]}({setting})'
            cols = _get_method_aurocs(method, filter)
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))
    for method in methods3:
        for filter in filters3[method]:
            setting = [score_models[model] for model in filter[1:].split('_')]
            method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
            cols = _get_method_aurocs(method, filter)
            cols = [f'{col:.4f}' for col in cols]
            print(method_name, ' '.join(cols))

def report_maxlen_trends(args):
    datasets = {'xsum': 'XSum',
                'writing': 'WritingPrompts'}
    source_models = {'gpt-3.5-turbo': 'ChatGPT',
                     'gpt-4': 'GPT-4'}
    score_models = {'t5-11b': 'T5-11B',
                    'gpt2-xl': 'GPT-2',
                    'opt-2.7b': 'OPT-2.7',
                    'gpt-neo-2.7B': 'Neo-2.7',
                    'gpt-j-6B': 'GPT-J',
                    'gpt-neox-20b': 'NeoX'}
    methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
                'roberta-large-openai-detector': 'RoBERTa-large'}
    methods2 = {'likelihood': 'Likelihood'}
    methods3 = {'perturbation_100': 'DetectGPT',
                'sampling_discrepancy_analytic': 'Fast-Detect'}
    maxlens = [30, 60, 90, 120, 150, 180]

    def _get_method_aurocs(root_path, dataset, source_model, method, filter=''):
        cols = []
        for maxlen in maxlens:
            result_file = f'{root_path}/exp_maxlen{maxlen}/results/{dataset}_{source_model}{filter}.{method}.json'
            if os.path.exists(result_file):
                auroc = get_auroc(result_file)
            else:
                auroc = 0.0
            cols.append(auroc)
        return cols

    filters2 = {'likelihood': '.gpt-neo-2.7B'}
    filters3 = {'perturbation_100': '.t5-11b_gpt-neo-2.7B',
                'sampling_discrepancy_analytic': '.gpt-j-6B_gpt-neo-2.7B'}

    headers = ['Method'] + [str(maxlen) for maxlen in maxlens]
    print(' '.join(headers))
    # print table per model and dataset
    results = {}
    for model in source_models:
        model_name = source_models[model]
        for data in datasets:
            data_name = datasets[data]
            print('----')
            print(f'{model_name} / {data_name}')
            print('----')
            for method in methods1:
                method_name = methods1[method]
                cols = _get_method_aurocs('.', data, model, method)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                cols = [f'{col:.4f}' for col in cols]
                print(method_name, ' '.join(cols))
            for method in methods2:
                filter = filters2[method]
                setting = score_models[filter[1:]]
                method_name = f'{methods2[method]}({setting})'
                cols = _get_method_aurocs('.', data, model, method, filter)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                cols = [f'{col:.4f}' for col in cols]
                print(method_name, ' '.join(cols))
            for method in methods3:
                filter = filters3[method]
                setting = [score_models[model] for model in filter[1:].split('_')]
                method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
                cols = _get_method_aurocs('.', data, model, method, filter)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                cols = [f'{col:.4f}' for col in cols]
                print(method_name, ' '.join(cols))
    import json
    json_file = './exp_analysis/maxlen_trends.json'
    with open(json_file, 'w') as fout:
        json.dump(results, fout)
        print(f'Write to file {json_file}')

def report_auroc_curve(args):
    datasets = {'xsum': 'XSum',
                'writing': 'WritingPrompts'}
    source_models = {'gpt-3.5-turbo': 'ChatGPT',
                     'gpt-4': 'GPT-4'}
    score_models = {'t5-11b': 'T5-11B',
                    'gpt2-xl': 'GPT-2',
                    'opt-2.7b': 'OPT-2.7',
                    'gpt-neo-2.7B': 'Neo-2.7',
                    'gpt-j-6B': 'GPT-J',
                    'gpt-neox-20b': 'NeoX'}
    methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
                'roberta-large-openai-detector': 'RoBERTa-large'}
    methods2 = {'likelihood': 'Likelihood'}
    methods3 = {'perturbation_100': 'DetectGPT',
                'sampling_discrepancy_analytic': 'Fast-Detect'}

    def _get_method_fpr_tpr(root_path, dataset, source_model, method, filter=''):
        maxlen = 180
        result_file = f'{root_path}/exp_maxlen{maxlen}/results/{dataset}_{source_model}{filter}.{method}.json'
        if os.path.exists(result_file):
            fpr, tpr = get_fpr_tpr(result_file)
        else:
            fpr, tpr = [], []
        assert len(fpr) == len(tpr)
        return list(zip(fpr, tpr))

    filters2 = {'likelihood': '.gpt-neo-2.7B'}
    filters3 = {'perturbation_100': '.t5-11b_gpt-neo-2.7B',
                'sampling_discrepancy_analytic': '.gpt-j-6B_gpt-neo-2.7B'}

    # print table per model and dataset
    results = {}
    for model in source_models:
        model_name = source_models[model]
        for data in datasets:
            data_name = datasets[data]
            print('----')
            print(f'{model_name} / {data_name}')
            print('----')
            for method in methods1:
                method_name = methods1[method]
                cols = _get_method_fpr_tpr('.', data, model, method)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                cols = [f'({col[0]:.3f},{col[1]:.3f})' for col in cols]
                print(method_name, ' '.join(cols))
            for method in methods2:
                filter = filters2[method]
                setting = score_models[filter[1:]]
                method_name = f'{methods2[method]}({setting})'
                cols = _get_method_fpr_tpr('.', data, model, method, filter)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                cols = [f'({col[0]:.3f},{col[1]:.3f})' for col in cols]
                print(method_name, ' '.join(cols))
            for method in methods3:
                filter = filters3[method]
                setting = [score_models[model] for model in filter[1:].split('_')]
                method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
                cols = _get_method_fpr_tpr('.', data, model, method, filter)
                results[f'{model_name}_{data_name}_{method_name}'] = cols
                cols = [f'({col[0]:.3f},{col[1]:.3f})' for col in cols]
                print(method_name, ' '.join(cols))
    import json
    json_file = './exp_analysis/auroc_curve.json'
    with open(json_file, 'w') as fout:
        json.dump(results, fout)
        print(f'Write to file {json_file}')

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--result_path', type=str, default="./exp_main/results/")
    parser.add_argument('--report_name', type=str, default="main_results")
    args = parser.parse_args()

    if args.report_name == 'main_results':
        report_main_results(args)
    elif args.report_name == 'main_ext_results':
        report_main_ext_results(args)
    elif args.report_name == 'chatgpt_gpt4_results':
        report_chatgpt_gpt4_results(args)
    elif args.report_name == 'gpt3_results':
        report_gpt3_results(args)
    elif args.report_name == 'maxlen_trends':
        report_maxlen_trends(args)
    elif args.report_name == 'auroc_curve':
        report_auroc_curve(args)
    elif args.report_name == 'refmodel_results':
        report_refmodel_results(args)
requirements.txt
CHANGED
@@ -1,3 +1,8 @@
torch
numpy
transformers==4.28.1
datasets==2.12.0
matplotlib
tqdm
openai
nltk
setup.sh
ADDED
@@ -0,0 +1 @@
pip install -r requirements.txt
show_result.py
ADDED
@@ -0,0 +1,51 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import matplotlib
import matplotlib.pyplot as plt
import argparse
import glob
import json
from os import path

import numpy as np

matplotlib.use('Agg')

# plot histogram of sampled on left, and original on right
def save_histogram(predictions, figure_file):
    plt.figure(figsize=(4, 2.5))
    plt.subplot(1, 1, 1)
    plt.hist(predictions["samples"], alpha=0.5, bins='auto', label='Model')
    plt.hist(predictions["real"], alpha=0.5, bins='auto', label='Human')
    plt.xlabel("Sampling Discrepancy")
    plt.ylabel('Frequency')
    plt.legend(loc='upper right')
    plt.tight_layout()
    plt.savefig(figure_file)

if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--result_files', type=str, default="./exp_test/results/*.json")
    parser.add_argument('--draw', action='store_true')
    args = parser.parse_args()

    for res_file in glob.glob(args.result_files, recursive=True):
        with open(res_file, 'r') as fin:
            res = json.load(fin)
            if 'metrics' in res:
                n_samples = res['info']['n_samples']
                roc_auc = res['metrics']['roc_auc']
                real = res['predictions']['real']
                samples = res['predictions']['samples']
                print(f"{res_file}: roc_auc={roc_auc:.4f} n_samples={n_samples} r:{np.mean(real):.2f}/{np.std(real):.2f} s:{np.mean(samples):.2f}/{np.std(samples):.2f}")
            else:
                print(f"{res_file}: metrics not found.")
            # draw histogram
            if args.draw:
                fig_file = f"{res_file}.pdf"
                save_histogram(res['predictions'], fig_file)
                print(f"{fig_file}: histogram figure saved.")
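A minimal sketch of save_histogram on made-up prediction scores, assuming the Agg backend can write to the working directory; the output file name is illustrative:

from show_result import save_histogram

predictions = {'real': [0.1, 0.2, 0.3, 0.25], 'samples': [0.8, 0.7, 0.9, 0.85]}
save_histogram(predictions, 'example_histogram.pdf')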
supervised.py
ADDED
@@ -0,0 +1,78 @@
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

import numpy as np
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import tqdm
import argparse
import json
from data_builder import load_data
from metrics import get_roc_metrics, get_precision_recall_metrics
from model import from_pretrained

def experiment(args):
    # load model
    print(f'Beginning supervised evaluation with {args.model_name}...')
    detector = from_pretrained(AutoModelForSequenceClassification, args.model_name, {}, args.cache_dir).to(args.device)
    tokenizer = from_pretrained(AutoTokenizer, args.model_name, {}, args.cache_dir)
    detector.eval()
    # load data
    data = load_data(args.dataset_file)
    n_samples = len(data["sampled"])
    # eval detector
    name = args.model_name
    torch.manual_seed(args.seed)
    np.random.seed(args.seed)
    eval_results = []
    for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
        original_text = data["original"][idx]
        sampled_text = data["sampled"][idx]
        # original text
        tokenized = tokenizer(original_text, padding=True, truncation=True, max_length=512, return_tensors="pt").to(args.device)
        with torch.no_grad():
            original_crit = detector(**tokenized).logits.softmax(-1)[0, 0].item()
        # sampled text
        tokenized = tokenizer(sampled_text, padding=True, truncation=True, max_length=512, return_tensors="pt").to(args.device)
        with torch.no_grad():
            sampled_crit = detector(**tokenized).logits.softmax(-1)[0, 0].item()
        # result
        eval_results.append({"original": original_text,
                             "original_crit": original_crit,
                             "sampled": sampled_text,
                             "sampled_crit": sampled_crit})

    # compute prediction scores for real/sampled passages
    predictions = {'real': [x["original_crit"] for x in eval_results],
                   'samples': [x["sampled_crit"] for x in eval_results]}
    fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
    p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
    print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
    # log results
    results_file = f'{args.output_file}.{name}.json'
    results = {'name': f'{name}_threshold',
               'info': {'n_samples': n_samples},
               'predictions': predictions,
               'raw_results': eval_results,
               'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
               'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
               'loss': 1 - pr_auc}
    with open(results_file, 'w') as fout:
        json.dump(results, fout)
        print(f'Results written into {results_file}')


if __name__ == '__main__':
    parser = argparse.ArgumentParser()
    parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
    parser.add_argument('--dataset', type=str, default="xsum")
    parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
    parser.add_argument('--model_name', type=str, default="roberta-base-openai-detector")
    parser.add_argument('--seed', type=int, default=0)
    parser.add_argument('--device', type=str, default="cuda")
    parser.add_argument('--cache_dir', type=str, default="../cache")
    args = parser.parse_args()

    experiment(args)
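The detection criterion above is simply the softmax probability of the detector's first class (logit index 0); a minimal standalone sketch of computing it for one passage, assuming the roberta-base-openai-detector checkpoint can be downloaded from the Hub and using a made-up input text:

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-base-openai-detector"
detector = AutoModelForSequenceClassification.from_pretrained(name).eval()
tokenizer = AutoTokenizer.from_pretrained(name)

text = "An example passage to score."  # made-up input
tokenized = tokenizer(text, truncation=True, max_length=512, return_tensors="pt")
with torch.no_grad():
    # probability assigned to class 0, used directly as the detection score
    crit = detector(**tokenized).logits.softmax(-1)[0, 0].item()
print(f"criterion: {crit:.4f}")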
supervised.sh
ADDED
@@ -0,0 +1,56 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_supervised
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

# preparing dataset
for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
  IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
  echo `date`, Preparing dataset ${D}-${M} ...
  python scripts/data_builder.py --dataset $D --n_samples 200 --base_model_name $M --output_file $data_path/${D}_${M}
done

# evaluate baselines
for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
  IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
  echo `date`, Evaluating baseline methods on ${D}_${M} ...
  python scripts/baselines.py --scoring_model_name $M --dataset $D \
                      --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
done

# evaluate supervised detectors
for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
  IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
  for SM in roberta-base-openai-detector roberta-large-openai-detector; do
    echo `date`, Evaluating ${SM} on ${D}_${M} ...
    python scripts/supervised.py --model_name $SM --dataset $D \
                      --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done

# evaluate DetectGPT
for P in "english:mgpt:mt5-xl" "german:mgpt:mt5-xl" "pubmed:pubmedgpt:t5-11b" "xsum:gpt2-xl:t5-11b"; do
  IFS=':' read -r -a P <<< $P && D=${P[0]} && M1=${P[1]} && M2=${P[2]}
  echo `date`, Evaluating DetectGPT on ${D}_${M1}_${M2} ...
  python scripts/detect_gpt.py --scoring_model_name $M1 --mask_filling_model_name $M2 --n_perturbations 100 --dataset $D \
                      --dataset_file $data_path/${D}_${M1} --output_file $res_path/${D}_${M1}_${M2}
done

# evaluate Fast-DetectGPT
for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
  IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
  echo `date`, Evaluating Fast-DetectGPT on ${D}-${M} ...
  python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M \
                      --dataset $D --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
done
temperature.sh
ADDED
@@ -0,0 +1,88 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_temperature
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"

# preparing dataset
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Preparing dataset ${D}-${M} ...
    python scripts/data_builder.py --dataset $D --n_samples 500 --do_temperature --base_model_name $M --output_file $data_path/${D}_${M}
  done
done

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
    python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

    echo `date`, Evaluating baseline methods on ${D}_${M} ...
    python scripts/baselines.py --scoring_model_name $M --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating DetectGPT on ${D}_${M} ...
    python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    # we leverage DetectGPT to generate the perturbations
    echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
    python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
                          --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
  done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
  for M in $source_models; do
    M1=gpt-j-6B  # sampling model
    for M2 in $scoring_models; do
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    M1=t5-3b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done
topk.sh
ADDED
@@ -0,0 +1,88 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_topk
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"

# preparing dataset
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Preparing dataset ${D}-${M} ...
    python scripts/data_builder.py --dataset $D --n_samples 500 --do_top_k --base_model_name $M --output_file $data_path/${D}_${M}
  done
done

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
    python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

    echo `date`, Evaluating baseline methods on ${D}_${M} ...
    python scripts/baselines.py --scoring_model_name $M --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating DetectGPT on ${D}_${M} ...
    python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    # we leverage DetectGPT to generate the perturbations
    echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
    python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
                          --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
  done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
  for M in $source_models; do
    M1=gpt-j-6B  # sampling model
    for M2 in $scoring_models; do
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    M1=t5-3b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done
topp.sh
ADDED
@@ -0,0 +1,88 @@
#!/usr/bin/env bash
# Copyright (c) Guangsheng Bao.
#
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

# setup the environment
echo `date`, Setup the environment ...
set -e  # exit if error

# prepare folders
exp_path=exp_topp
data_path=$exp_path/data
res_path=$exp_path/results
mkdir -p $exp_path $data_path $res_path

datasets="xsum squad writing"
source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"

# preparing dataset
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Preparing dataset ${D}-${M} ...
    python scripts/data_builder.py --dataset $D --n_samples 500 --do_top_p --base_model_name $M --output_file $data_path/${D}_${M}
  done
done

# White-box Setting
echo `date`, Evaluate models in the white-box setting:

# evaluate Fast-DetectGPT and fast baselines
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
    python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}

    echo `date`, Evaluating baseline methods on ${D}_${M} ...
    python scripts/baselines.py --scoring_model_name $M --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    echo `date`, Evaluating DetectGPT on ${D}_${M} ...
    python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
    # we leverage DetectGPT to generate the perturbations
    echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
    python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
                          --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
  done
done


# Black-box Setting
echo `date`, Evaluate models in the black-box setting:
scoring_models="gpt-neo-2.7B"

# evaluate Fast-DetectGPT
for D in $datasets; do
  for M in $source_models; do
    M1=gpt-j-6B  # sampling model
    for M2 in $scoring_models; do
      echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done

# evaluate DetectGPT and its improvement DetectLLM
for D in $datasets; do
  for M in $source_models; do
    M1=t5-3b  # perturbation model
    for M2 in $scoring_models; do
      echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
                          --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
      # we leverage DetectGPT to generate the perturbations
      echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
      python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
                          --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
    done
  done
done