azra-kml committed on
Commit
aefc9ef
1 Parent(s): d5b0bd7

Upload 30 files

Files changed (30)
  1. LICENSE +21 -0
  2. README.md +80 -14
  3. attack.sh +85 -0
  4. baselines.py +137 -0
  5. custom_datasets.py +96 -0
  6. data_builder.py +276 -0
  7. data_truncator.py +97 -0
  8. detect_gpt.py +295 -0
  9. detect_llm.py +128 -0
  10. detector.py +11 -0
  11. dna_gpt.py +211 -0
  12. fast_detect_gpt.py +162 -0
  13. gpt3to4.sh +116 -0
  14. gptzero.py +84 -0
  15. index.html +106 -0
  16. local_infer.py +94 -0
  17. main.sh +97 -0
  18. main_ext.sh +89 -0
  19. metrics.py +26 -0
  20. model.py +79 -0
  21. paraphrasing.py +106 -0
  22. report_results.py +490 -0
  23. requirements.txt +8 -3
  24. setup.sh +1 -0
  25. show_result.py +51 -0
  26. supervised.py +78 -0
  27. supervised.sh +56 -0
  28. temperature.sh +88 -0
  29. topk.sh +88 -0
  30. topp.sh +88 -0
LICENSE ADDED
@@ -0,0 +1,21 @@
1
+ MIT License
2
+
3
+ Copyright (c) 2023 Bao Guangsheng
4
+
5
+ Permission is hereby granted, free of charge, to any person obtaining a copy
6
+ of this software and associated documentation files (the "Software"), to deal
7
+ in the Software without restriction, including without limitation the rights
8
+ to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9
+ copies of the Software, and to permit persons to whom the Software is
10
+ furnished to do so, subject to the following conditions:
11
+
12
+ The above copyright notice and this permission notice shall be included in all
13
+ copies or substantial portions of the Software.
14
+
15
+ THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16
+ IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17
+ FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18
+ AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19
+ LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20
+ OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21
+ SOFTWARE.
README.md CHANGED
@@ -1,14 +1,80 @@
1
- ---
2
- title: Fast Detect Gpt
3
- emoji: 🏆
4
- colorFrom: indigo
5
- colorTo: red
6
- sdk: streamlit
7
- sdk_version: 1.41.0
8
- app_file: app.py
9
- pinned: false
10
- license: mit
11
- short_description: analiz
12
- ---
13
-
14
- Check out the configuration reference at https://huggingface.co/docs/hub/spaces-config-reference
1
+ # Fast-DetectGPT
2
+ **This is the code for the ICLR 2024 paper "Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature"**. Some of the code is borrowed or extended from [DetectGPT](https://github.com/eric-mitchell/detect-gpt).
3
+
4
+ [Paper](https://arxiv.org/abs/2310.05130)
5
+ | [LocalDemo](#local-demo)
6
+ | [OnlineDemo](http://region-9.autodl.pro:21504/)
7
+ | [OpenReview](https://openreview.net/forum?id=Bpcgcr8E8Z)
8
+
9
+
10
+ ## Brief Intro
11
+ <table class="tg" style="padding-left: 30px;">
12
+ <tr>
13
+ <th class="tg-0pky">Method</th>
14
+ <th class="tg-0pky">5-Model Generations ↑</th>
15
+ <th class="tg-0pky">ChatGPT/GPT-4 Generations ↑</th>
16
+ <th class="tg-0pky">Speedup ↑</th>
17
+ </tr>
18
+ <tr>
19
+ <td class="tg-0pky">DetectGPT</td>
20
+ <td class="tg-0pky">0.9554</td>
21
+ <td class="tg-0pky">0.7225</td>
22
+ <td class="tg-0pky">1x</td>
23
+ </tr>
24
+ <tr>
25
+ <td class="tg-0pky">Fast-DetectGPT</td>
26
+ <td class="tg-0pky">0.9887 (relative↑ <b>74.7%</b>)</td>
27
+ <td class="tg-0pky">0.9338 (relative↑ <b>76.1%</b>)</td>
28
+ <td class="tg-0pky"><b>340x</b></td>
29
+ </tr>
30
+ </table>
31
+ The table shows detection accuracy (measured in AUROC) and computational speedup for machine-generated text detection. The <b>white-box setting</b> (directly using the source model) is used for detecting generations produced by five source models (5-model), whereas the <b>black-box
32
+ setting</b> (utilizing surrogate models) targets ChatGPT and GPT-4 generations. The relative improvement is computed over the remaining error of DetectGPT, i.e., (AUROC of Fast-DetectGPT - AUROC of DetectGPT) / (1 - AUROC of DetectGPT). AUROC results are averaged across the datasets and source models. Speedup assessments were conducted on a Tesla A100 GPU.
33
+
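+ The criterion itself can be sketched in a few lines. The snippet below is a minimal illustration of the analytic conditional-probability-curvature score described in the paper; it is not a copy of fast_detect_gpt.py, and the tensor names and shapes (reference/scoring logits of shape [1, seq_len, vocab], labels of shape [1, seq_len]) are assumptions made for the example.
+ ```python
+ import torch
+
+ def sampling_discrepancy_analytic(logits_ref, logits_score, labels):
+     # log-probabilities under the scoring model and probabilities under the reference model
+     lprobs_score = torch.log_softmax(logits_score, dim=-1)
+     probs_ref = torch.softmax(logits_ref, dim=-1)
+     # log-likelihood of the observed tokens under the scoring model
+     log_likelihood = lprobs_score.gather(dim=-1, index=labels.unsqueeze(-1)).squeeze(-1)
+     # mean and variance of the token log-probability under the reference (sampling) distribution
+     mean_ref = (probs_ref * lprobs_score).sum(dim=-1)
+     var_ref = (probs_ref * lprobs_score.square()).sum(dim=-1) - mean_ref.square()
+     # curvature: how far the observed log-likelihood sits above that of sampled alternatives
+     return ((log_likelihood.sum(dim=-1) - mean_ref.sum(dim=-1)) / var_ref.sum(dim=-1).sqrt()).mean().item()
+ ```
+ A larger score means the passage looks more like the model's own samples and is therefore more likely to be machine-generated.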
34
+
35
+ ## Environment
36
+ * Python 3.8
37
+ * PyTorch 1.10.0
38
+ * Setup the environment:
39
+ ```bash setup.sh```
40
+
41
+ (Note: our experiments were run on a single Tesla A100 GPU with 80GB of memory.)
42
+
43
+ ## Local Demo
44
+ Please run the following command locally for an interactive demo:
45
+ ```
46
+ python scripts/local_infer.py
47
+ ```
48
+ where the default reference and sampling models are both gpt-neo-2.7B.
49
+
50
+ You can use gpt-j-6B as the reference model to obtain more accurate detections:
51
+ ```
52
+ python scripts/local_infer.py --reference_model_name gpt-j-6B
53
+ ```
54
+
55
+
56
+ An example (using gpt-j-6B as the reference model) looks like this:
57
+ ```
58
+ Please enter your text: (Press Enter twice to start processing)
59
+ Disguised as police, they broke through a fence on Monday evening and broke into the cargo of a Swiss-bound plane to take the valuable items. The audacious heist occurred at an airport in a small European country, leaving authorities baffled and airline officials in shock.
60
+
61
+ Fast-DetectGPT criterion is 1.9299, suggesting that the text has a probability of 87% to be machine-generated.
62
+ ```
63
+
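+ The demo reports a probability alongside the raw criterion. The calibration behind that conversion is not shown on this page, so the sketch below is only one plausible scheme (Bayes' rule over Gaussians fitted to criterion scores of known human-written and machine-generated texts); all numeric parameters are placeholders, not values from this repository.
+ ```python
+ import math
+
+ def gaussian_pdf(x, mu, sigma):
+     return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))
+
+ def prob_machine(crit, mu_human=-0.5, sigma_human=1.2, mu_machine=2.5, sigma_machine=1.2):
+     # equal priors assumed; the four distribution parameters are hypothetical calibration values
+     p_human = gaussian_pdf(crit, mu_human, sigma_human)
+     p_machine = gaussian_pdf(crit, mu_machine, sigma_machine)
+     return p_machine / (p_human + p_machine)
+
+ print(f"{prob_machine(1.9299):.0%} probability of being machine-generated")
+ ```
+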
64
+ ## Workspace
65
+ The following folders are created for our experiments:
66
+ * ./exp_main -> experiments for 5-model generations (main.sh).
67
+ * ./exp_gpt3to4 -> experiments for GPT-3, ChatGPT, and GPT-4 generations (gpt3to4.sh).
68
+
69
+ (Note: we share <b>generations from GPT-3, ChatGPT, and GPT-4</b> in exp_gpt3to4/data for convenient reproduction.)
70
+
71
+ ### Citation
72
+ If you find this work useful, you can cite it with the following BibTeX entry:
73
+
74
+ @inproceedings{bao2023fast,
75
+ title={Fast-DetectGPT: Efficient Zero-Shot Detection of Machine-Generated Text via Conditional Probability Curvature},
76
+ author={Bao, Guangsheng and Zhao, Yanbin and Teng, Zhiyang and Yang, Linyi and Zhang, Yue},
77
+ booktitle={The Twelfth International Conference on Learning Representations},
78
+ year={2023}
79
+ }
80
+
attack.sh ADDED
@@ -0,0 +1,85 @@
1
+ #!/usr/bin/env bash
2
+ # Copyright (c) Guangsheng Bao.
3
+ #
4
+ # This source code is licensed under the MIT license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # setup the environment
8
+ echo `date`, Setup the environment ...
9
+ set -e # exit if error
10
+
11
+ # prepare folders
12
+ para=t5 # "t5" for paraphrasing attack, or "random" for decoherence attack
13
+ exp_path=exp_attack
14
+ data_path=$exp_path/data
15
+ res_path=$exp_path/results
16
+ mkdir -p $exp_path $data_path $res_path
17
+
18
+ src_path=exp_gpt3to4
19
+ src_data_path=$src_path/data
20
+
21
+ datasets="xsum writing pubmed"
22
+ source_models="gpt-3.5-turbo"
23
+
24
+ # preparing dataset
25
+ for D in $datasets; do
26
+ for M in $source_models; do
27
+ echo `date`, Preparing dataset ${D}_${M} by paraphrasing ${src_data_path}/${D}_${M} ...
28
+ python scripts/paraphrasing.py --dataset $D --dataset_file $src_data_path/${D}_${M} \
29
+ --paraphraser $para --output_file $data_path/${D}_${M}
30
+ done
31
+ done
32
+
33
+ # evaluate Fast-DetectGPT in the black-box setting
34
+ settings="gpt-j-6B:gpt2-xl gpt-j-6B:gpt-neo-2.7B gpt-j-6B:gpt-j-6B"
35
+ for D in $datasets; do
36
+ for M in $source_models; do
37
+ for S in $settings; do
38
+ IFS=':' read -r -a S <<< $S && M1=${S[0]} && M2=${S[1]}
39
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
40
+ python scripts/fast_detect_gpt.py --reference_model_name $M1 --scoring_model_name $M2 --discrepancy_analytic \
41
+ --dataset $D --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
42
+ done
43
+ done
44
+ done
45
+
46
+ # evaluate supervised detectors
47
+ supervised_models="roberta-base-openai-detector roberta-large-openai-detector"
48
+ for D in $datasets; do
49
+ for M in $source_models; do
50
+ for SM in $supervised_models; do
51
+ echo `date`, Evaluating ${SM} on ${D}_${M} ...
52
+ python scripts/supervised.py --model_name $SM --dataset $D \
53
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
54
+ done
55
+ done
56
+ done
57
+
58
+ # evaluate fast baselines
59
+ scoring_models="gpt-neo-2.7B"
60
+ for D in $datasets; do
61
+ for M in $source_models; do
62
+ for M2 in $scoring_models; do
63
+ echo `date`, Evaluating baseline methods on ${D}_${M}.${M2} ...
64
+ python scripts/baselines.py --scoring_model_name ${M2} --dataset $D \
65
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M2}
66
+ done
67
+ done
68
+ done
69
+
70
+ # evaluate DetectGPT and DetectLLM
71
+ scoring_models="gpt2-xl gpt-neo-2.7B gpt-j-6B"
72
+ for D in $datasets; do
73
+ for M in $source_models; do
74
+ M1=t5-11b # perturbation model
75
+ for M2 in $scoring_models; do
76
+ echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
77
+ python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
78
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
79
+ # we leverage DetectGPT to generate the perturbations
80
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
81
+ python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
82
+ --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
83
+ done
84
+ done
85
+ done
baselines.py ADDED
@@ -0,0 +1,137 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+
6
+ import numpy as np
7
+ import torch
8
+ import torch.nn.functional as F
9
+ import tqdm
10
+ import argparse
11
+ import json
12
+ from data_builder import load_data
13
+ from model import load_tokenizer, load_model
14
+ from metrics import get_roc_metrics, get_precision_recall_metrics
15
+
16
+ def get_likelihood(logits, labels):
17
+ assert logits.shape[0] == 1
18
+ assert labels.shape[0] == 1
19
+
20
+ logits = logits.view(-1, logits.shape[-1])
21
+ labels = labels.view(-1)
22
+ log_probs = torch.nn.functional.log_softmax(logits, dim=-1)
23
+ log_likelihood = log_probs.gather(dim=-1, index=labels.unsqueeze(-1)).squeeze(-1)
24
+ return log_likelihood.mean().item()
25
+
26
+ def get_rank(logits, labels):
27
+ assert logits.shape[0] == 1
28
+ assert labels.shape[0] == 1
29
+
30
+ # get rank of each label token in the model's likelihood ordering
31
+ matches = (logits.argsort(-1, descending=True) == labels.unsqueeze(-1)).nonzero()
32
+ assert matches.shape[1] == 3, f"Expected 3 dimensions in matches tensor, got {matches.shape}"
33
+
34
+ ranks, timesteps = matches[:, -1], matches[:, -2]
35
+
36
+ # make sure we got exactly one match for each timestep in the sequence
37
+ assert (timesteps == torch.arange(len(timesteps)).to(timesteps.device)).all(), "Expected one match per timestep"
38
+
39
+ ranks = ranks.float() + 1 # convert to 1-indexed rank
40
+ return -ranks.mean().item()
41
+
42
+ def get_logrank(logits, labels):
43
+ assert logits.shape[0] == 1
44
+ assert labels.shape[0] == 1
45
+
46
+ # get rank of each label token in the model's likelihood ordering
47
+ matches = (logits.argsort(-1, descending=True) == labels.unsqueeze(-1)).nonzero()
48
+ assert matches.shape[1] == 3, f"Expected 3 dimensions in matches tensor, got {matches.shape}"
49
+
50
+ ranks, timesteps = matches[:, -1], matches[:, -2]
51
+
52
+ # make sure we got exactly one match for each timestep in the sequence
53
+ assert (timesteps == torch.arange(len(timesteps)).to(timesteps.device)).all(), "Expected one match per timestep"
54
+
55
+ ranks = ranks.float() + 1 # convert to 1-indexed rank
56
+ ranks = torch.log(ranks)
57
+ return -ranks.mean().item()
58
+
59
+ def get_entropy(logits, labels):
60
+ assert logits.shape[0] == 1
61
+ assert labels.shape[0] == 1
62
+
63
+ entropy = F.softmax(logits, dim=-1) * F.log_softmax(logits, dim=-1)
64
+ entropy = -entropy.sum(-1)
65
+ return entropy.mean().item()
66
+
67
+
68
+ def experiment(args):
69
+ # load model
70
+ scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
71
+ scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
72
+ scoring_model.eval()
73
+ # load data
74
+ data = load_data(args.dataset_file)
75
+ n_samples = len(data["sampled"])
76
+ # evaluation criteria
77
+ criterion_fns = {'likelihood': get_likelihood,
78
+ 'rank': get_rank,
79
+ 'logrank': get_logrank,
80
+ 'entropy': get_entropy}
81
+ for name in criterion_fns:
82
+ criterion_fn = criterion_fns[name]
83
+ torch.manual_seed(args.seed)
84
+ np.random.seed(args.seed)
85
+ eval_results = []
86
+ for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
87
+ original_text = data["original"][idx]
88
+ sampled_text = data["sampled"][idx]
89
+ # original text
90
+ tokenized = scoring_tokenizer(original_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
91
+ labels = tokenized.input_ids[:, 1:]
92
+ with torch.no_grad():
93
+ logits = scoring_model(**tokenized).logits[:, :-1]
94
+ original_crit = criterion_fn(logits, labels)
95
+ # sampled text
96
+ tokenized = scoring_tokenizer(sampled_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
97
+ labels = tokenized.input_ids[:, 1:]
98
+ with torch.no_grad():
99
+ logits = scoring_model(**tokenized).logits[:, :-1]
100
+ sampled_crit = criterion_fn(logits, labels)
101
+ # result
102
+ eval_results.append({"original": original_text,
103
+ "original_crit": original_crit,
104
+ "sampled": sampled_text,
105
+ "sampled_crit": sampled_crit})
106
+
107
+ # compute prediction scores for real/sampled passages
108
+ predictions = {'real': [x["original_crit"] for x in eval_results],
109
+ 'samples': [x["sampled_crit"] for x in eval_results]}
110
+ fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
111
+ p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
112
+ print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
113
+ # log results
114
+ results_file = f'{args.output_file}.{name}.json'
115
+ results = { 'name': f'{name}_threshold',
116
+ 'info': {'n_samples': n_samples},
117
+ 'predictions': predictions,
118
+ 'raw_results': eval_results,
119
+ 'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
120
+ 'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
121
+ 'loss': 1 - pr_auc}
122
+ with open(results_file, 'w') as fout:
123
+ json.dump(results, fout)
124
+ print(f'Results written into {results_file}')
125
+
126
+ if __name__ == '__main__':
127
+ parser = argparse.ArgumentParser()
128
+ parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
129
+ parser.add_argument('--dataset', type=str, default="xsum")
130
+ parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
131
+ parser.add_argument('--scoring_model_name', type=str, default="gpt2")
132
+ parser.add_argument('--seed', type=int, default=0)
133
+ parser.add_argument('--device', type=str, default="cuda")
134
+ parser.add_argument('--cache_dir', type=str, default="../cache")
135
+ args = parser.parse_args()
136
+
137
+ experiment(args)
custom_datasets.py ADDED
@@ -0,0 +1,96 @@
1
+ import os.path
2
+ import random
3
+ import datasets
4
+
5
+ SEPARATOR = '<<<SEP>>>'
6
+
7
+
8
+ DATASETS = ['writing', 'english', 'german', 'pubmed']
9
+
10
+ def load_dataset(path, name=None, split=None, cache_dir=None):
11
+ # use the local copy of the dataset if it exists
12
+ local_path = os.path.join(cache_dir, f'local.{path}_{name}_{split}')
13
+ if os.path.exists(local_path):
14
+ return datasets.load_from_disk(local_path)
15
+ return datasets.load_dataset(path, name, split=split, cache_dir=cache_dir)
16
+
17
+ def load_pubmed(cache_dir):
18
+ data = load_dataset('pubmed_qa', 'pqa_labeled', split='train', cache_dir=cache_dir)
19
+
20
+ # combine question and long_answer
21
+ data = [f'Question: {q} Answer:{SEPARATOR}{a}' for q, a in zip(data['question'], data['long_answer'])]
22
+
23
+ return data
24
+
25
+
26
+ def process_prompt(prompt):
27
+ return prompt.replace('[ WP ]', '').replace('[ OT ]', '')
28
+
29
+
30
+ def process_spaces(story):
31
+ return story.replace(
32
+ ' ,', ',').replace(
33
+ ' .', '.').replace(
34
+ ' ?', '?').replace(
35
+ ' !', '!').replace(
36
+ ' ;', ';').replace(
37
+ ' \'', '\'').replace(
38
+ ' ’ ', '\'').replace(
39
+ ' :', ':').replace(
40
+ '<newline>', '\n').replace(
41
+ '`` ', '"').replace(
42
+ ' \'\'', '"').replace(
43
+ '\'\'', '"').replace(
44
+ '.. ', '... ').replace(
45
+ ' )', ')').replace(
46
+ '( ', '(').replace(
47
+ ' n\'t', 'n\'t').replace(
48
+ ' i ', ' I ').replace(
49
+ ' i\'', ' I\'').replace(
50
+ '\\\'', '\'').replace(
51
+ '\n ', '\n').strip()
52
+
53
+
54
+ def load_writing(cache_dir=None):
55
+ writing_path = 'data/writingPrompts'
56
+
57
+ with open(f'{writing_path}/valid.wp_source', 'r') as f:
58
+ prompts = f.readlines()
59
+ with open(f'{writing_path}/valid.wp_target', 'r') as f:
60
+ stories = f.readlines()
61
+
62
+ prompts = [process_prompt(prompt) for prompt in prompts]
63
+ joined = [process_spaces(prompt + " " + story) for prompt, story in zip(prompts, stories)]
64
+ filtered = [story for story in joined if 'nsfw' not in story and 'NSFW' not in story]
65
+
66
+ random.seed(0)
67
+ random.shuffle(filtered)
68
+
69
+ return filtered
70
+
71
+
72
+ def load_language(language, cache_dir):
73
+ # load either the english or german portion of the wmt16 dataset
74
+ assert language in ['en', 'de']
75
+ d = load_dataset('wmt16', 'de-en', split='train', cache_dir=cache_dir)
76
+ docs = d['translation']
77
+ desired_language_docs = [d[language] for d in docs]
78
+ lens = [len(d.split()) for d in desired_language_docs]
79
+ sub = [d for d, l in zip(desired_language_docs, lens) if l > 100 and l < 150]
80
+ return sub
81
+
82
+
83
+ def load_german(cache_dir):
84
+ return load_language('de', cache_dir)
85
+
86
+
87
+ def load_english(cache_dir):
88
+ return load_language('en', cache_dir)
89
+
90
+
91
+ def load(name, cache_dir, **kwargs):
92
+ if name in DATASETS:
93
+ load_fn = globals()[f'load_{name}']
94
+ return load_fn(cache_dir=cache_dir, **kwargs)
95
+ else:
96
+ raise ValueError(f'Unknown dataset {name}')
data_builder.py ADDED
@@ -0,0 +1,276 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+ import time
6
+
7
+ import numpy as np
8
+ import datasets
9
+ import torch
10
+ import random
11
+ import argparse
12
+ import os
13
+ import json
14
+ import custom_datasets
15
+ from model import load_tokenizer, load_model
16
+
17
+
18
+ def save_data(output_file, args, data):
19
+ # write args to file
20
+ args_file = f"{output_file}.args.json"
21
+ with open(args_file, "w") as fout:
22
+ json.dump(args.__dict__, fout, indent=4)
23
+ print(f"Args written into {args_file}")
24
+
25
+ # write the data to a json file in the save folder
26
+ data_file = f"{output_file}.raw_data.json"
27
+ with open(data_file, "w") as fout:
28
+ json.dump(data, fout, indent=4)
29
+ print(f"Raw data written into {data_file}")
30
+
31
+
32
+ def load_data(input_file):
33
+ data_file = f"{input_file}.raw_data.json"
34
+ with open(data_file, "r") as fin:
35
+ data = json.load(fin)
36
+ print(f"Raw data loaded from {data_file}")
37
+ return data
38
+
39
+
40
+ class DataBuilder:
41
+ def __init__(self, args):
42
+ self.args = args
43
+ self.base_tokenizer = load_tokenizer(args.base_model_name, args.dataset, args.cache_dir)
44
+ self.base_model = None if args.openai_model else load_model(args.base_model_name, args.device, args.cache_dir)
45
+
46
+ def _openai_sample(self, prefix):
47
+ def _drop_last_word(text):
48
+ return ' '.join(text.split(' ')[:-1])
49
+
50
+ import openai
51
+ assert self.args.openai_key is not None, "Must provide OpenAI API key as --openai_key"
52
+ openai.api_key = self.args.openai_key
53
+ if self.args.openai_base is not None:
54
+ openai.api_base = self.args.openai_base
55
+
56
+ if self.args.dataset != 'pubmed': # keep Answer: prefix for pubmed
57
+ prefix = _drop_last_word(prefix)
58
+
59
+ # sample from the openai model
60
+ kwargs = {"max_tokens": 200}
61
+ if self.args.do_top_p:
62
+ kwargs['top_p'] = self.args.top_p
63
+ elif self.args.do_top_k:
64
+ kwargs['top_k'] = self.args.top_k
65
+ elif self.args.do_temperature:
66
+ kwargs['temperature'] = self.args.temperature
67
+
68
+ if self.args.openai_model == 'davinci':
69
+ kwargs["engine"] = self.args.openai_model
70
+ response = openai.Completion.create(prompt=f"{prefix}", **kwargs)
71
+ return prefix + response['choices'][0]['text']
72
+
73
+ elif self.args.openai_model in ['gpt-3.5-turbo', 'gpt-4']:
74
+ roles = {'xsum': 'You are a News writer.',
75
+ 'writing': 'You are a Fiction writer.',
76
+ 'pubmed': 'You are a Technical writer.'}
77
+ prompts = {'xsum': 'Please write an article with about 150 words starting exactly with:',
78
+ 'writing': 'Please write an article with about 150 words starting exactly with:',
79
+ 'pubmed': 'Please answer the question in about 50 words.'}
80
+ messages = [
81
+ {'role': 'system', 'content': roles[self.args.dataset]},
82
+ {'role': 'user', 'content': f'{prompts[self.args.dataset]} {prefix}'},
83
+ ]
84
+ kwargs["model"] = self.args.openai_model
85
+ kwargs["messages"] = messages
86
+ response = openai.ChatCompletion.create(**kwargs)
87
+ response = response['choices'][0]['message']['content']
88
+ # ChatGPT may repeat the prefix
89
+ if response.startswith(prefix[:20]):
90
+ return response
91
+ return prefix + ' ' + response
92
+
93
+ else:
94
+ raise NotImplementedError
95
+
96
+ # sample from base_model using ****only**** the first 30 tokens in each example as context
97
+ def _sample_from_model(self, texts, min_words=55, prompt_tokens=30):
98
+ # encode each text as a list of token ids
99
+ if self.args.dataset == 'pubmed':
100
+ texts = [t[:t.index(custom_datasets.SEPARATOR)] for t in texts]
101
+ all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True, return_token_type_ids=False).to(self.args.device)
102
+ else:
103
+ all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True, return_token_type_ids=False).to(self.args.device)
104
+ all_encoded = {key: value[:, :prompt_tokens] for key, value in all_encoded.items()}
105
+
106
+ if self.args.openai_model:
107
+ # decode the prefixes back into text
108
+ prefixes = self.base_tokenizer.batch_decode(all_encoded['input_ids'], skip_special_tokens=True)
109
+
110
+ decoded = []
111
+ for idx, prefix in enumerate(prefixes):
112
+ while idx >= len(decoded):
113
+ try:
114
+ decoded.append(self._openai_sample(prefix))
115
+ except Exception as ex:
116
+ print(ex)
117
+ print('Wait 10 minutes before retry ...')
118
+ time.sleep(600)
119
+
120
+ else:
121
+ self.base_model.eval()
122
+ decoded = ['' for _ in range(len(texts))]
123
+
124
+ # sample from the model until we get a sample with at least min_words words for each example
125
+ # this is an inefficient way to do this (since we regenerate for all inputs if just one is too short), but it works
126
+ tries = 0
127
+ m = 0
128
+ while m < min_words:
129
+ if tries != 0:
130
+ print()
131
+ print(f"min words: {m}, needed {min_words}, regenerating (try {tries})")
132
+ prefixes = self.base_tokenizer.batch_decode(all_encoded['input_ids'], skip_special_tokens=True)
133
+ for prefix, x in zip(prefixes, decoded):
134
+ if len(x.split()) == m:
135
+ print(prefix, '=>', x)
136
+
137
+ sampling_kwargs = {}
138
+ if self.args.do_top_p:
139
+ sampling_kwargs['top_p'] = self.args.top_p
140
+ elif self.args.do_top_k:
141
+ sampling_kwargs['top_k'] = self.args.top_k
142
+ elif self.args.do_temperature:
143
+ sampling_kwargs['temperature'] = self.args.temperature
144
+ min_length = 50 if self.args.dataset in ['pubmed'] else 150
145
+ outputs = self.base_model.generate(**all_encoded, min_length=min_length, max_length=200, do_sample=True,
146
+ **sampling_kwargs, pad_token_id=self.base_tokenizer.eos_token_id,
147
+ eos_token_id=self.base_tokenizer.eos_token_id)
148
+ decoded = self.base_tokenizer.batch_decode(outputs, skip_special_tokens=True)
149
+ m = min(len(x.split()) for x in decoded)
150
+ tries += 1
151
+
152
+ return decoded
153
+
154
+ def generate_samples(self, raw_data, batch_size):
155
+ # trim to shorter length
156
+ def _trim_to_shorter_length(texta, textb):
157
+ # truncate to shorter of o and s
158
+ shorter_length = min(len(texta.split(' ')), len(textb.split(' ')))
159
+ texta = ' '.join(texta.split(' ')[:shorter_length])
160
+ textb = ' '.join(textb.split(' ')[:shorter_length])
161
+ return texta, textb
162
+
163
+ def _truncate_to_substring(text, substring, idx_occurrence):
164
+ # truncate everything after the idx_occurrence occurrence of substring
165
+ assert idx_occurrence > 0, 'idx_occurrence must be > 0'
166
+ idx = -1
167
+ for _ in range(idx_occurrence):
168
+ idx = text.find(substring, idx + 1)
169
+ if idx == -1:
170
+ return text
171
+ return text[:idx]
172
+
173
+ data = {
174
+ "original": [],
175
+ "sampled": [],
176
+ }
177
+
178
+ for batch in range(len(raw_data) // batch_size):
179
+ print('Generating samples for batch', batch, 'of', len(raw_data) // batch_size)
180
+ original_text = raw_data[batch * batch_size:(batch + 1) * batch_size]
181
+ sampled_text = self._sample_from_model(original_text, min_words=30 if self.args.dataset in ['pubmed'] else 55)
182
+
183
+ for o, s in zip(original_text, sampled_text):
184
+ if self.args.dataset == 'pubmed':
185
+ s = _truncate_to_substring(s, 'Question:', 2)
186
+ o = o.replace(custom_datasets.SEPARATOR, ' ')
187
+
188
+ o, s = _trim_to_shorter_length(o, s)
189
+
190
+ # add to the data
191
+ data["original"].append(o)
192
+ data["sampled"].append(s)
193
+
194
+ return data
195
+
196
+ def generate_data(args, dataset, key):
197
+ # strip newlines from each example; replace one or more newlines with a single space
198
+ def _strip_newlines(text):
199
+ return ' '.join(text.split())
200
+
201
+ # load data
202
+ if dataset in custom_datasets.DATASETS:
203
+ data = custom_datasets.load(dataset, args.cache_dir)
204
+ else:
205
+ data = custom_datasets.load_dataset(dataset, split='train', cache_dir=args.cache_dir)[key]
206
+
207
+ # get unique examples, strip whitespace, and remove newlines
208
+ # then take just the long examples, shuffle, take the first 5,000 to tokenize to save time
209
+ # then take just the examples that are <= 512 tokens (for the base model)
210
+ # then generate n_samples samples
211
+
212
+ # remove duplicates from the data
213
+ data = list(dict.fromkeys(data)) # deterministic, as opposed to set()
214
+
215
+ # strip whitespace around each example
216
+ data = [x.strip() for x in data]
217
+
218
+ # remove newlines from each example
219
+ data = [_strip_newlines(x) for x in data]
220
+
221
+ # try to keep only examples with > 250 words
222
+ if dataset in ['writing', 'squad', 'xsum']:
223
+ long_data = [x for x in data if len(x.split()) > 250]
224
+ if len(long_data) > 0:
225
+ data = long_data
226
+
227
+ random.shuffle(data)
228
+ data = data[:5_000]
229
+
230
+ # keep only examples with <= 512 tokens according to base_tokenizer
231
+ # this step has the extra effect of removing examples with low-quality/garbage content
232
+ data_builder = DataBuilder(args)
233
+ tokenized_data = data_builder.base_tokenizer(data)
234
+ data = [x for x, y in zip(data, tokenized_data["input_ids"]) if len(y) <= 512]
235
+
236
+ # print stats about remaining data
237
+ print(f"Total number of samples: {len(data)}")
238
+ print(f"Average number of words: {np.mean([len(x.split()) for x in data])}")
239
+
240
+ return data_builder.generate_samples(data[:args.n_samples], batch_size=args.batch_size)
241
+
242
+ if __name__ == '__main__':
243
+ parser = argparse.ArgumentParser()
244
+ parser.add_argument('--output_file', type=str, default="./exp_gpt3/data/xsum_gpt2")
245
+ parser.add_argument('--dataset', type=str, default="xsum")
246
+ parser.add_argument('--n_samples', type=int, default=200)
247
+ parser.add_argument('--openai_base', type=str, default=None)
248
+ parser.add_argument('--openai_key', type=str, default=None)
249
+ parser.add_argument('--openai_model', type=str, default=None) # davinci, gpt-3.5-turbo, gpt-4
250
+ parser.add_argument('--base_model_name', type=str, default="gpt2")
251
+ parser.add_argument('--batch_size', type=int, default=50)
252
+ parser.add_argument('--do_top_k', action='store_true')
253
+ parser.add_argument('--top_k', type=int, default=40)
254
+ parser.add_argument('--do_top_p', action='store_true')
255
+ parser.add_argument('--top_p', type=float, default=0.96)
256
+ parser.add_argument('--do_temperature', action='store_true')
257
+ parser.add_argument('--temperature', type=float, default=0.8)
258
+ parser.add_argument('--seed', type=int, default=0)
259
+ parser.add_argument('--device', type=str, default="cuda")
260
+ parser.add_argument('--cache_dir', type=str, default="../cache")
261
+ args = parser.parse_args()
262
+
263
+ os.environ["XDG_CACHE_HOME"] = args.cache_dir
264
+ if not os.path.exists(args.cache_dir):
265
+ os.makedirs(args.cache_dir)
266
+ print(f"Using cache dir {args.cache_dir}")
267
+
268
+ random.seed(args.seed)
269
+ torch.manual_seed(args.seed)
270
+ np.random.seed(args.seed)
271
+
272
+ print(f'Loading dataset {args.dataset}...')
273
+ dataset_keys = {'xsum': 'document', 'squad': 'context', 'writing': 'document'}
274
+ data = generate_data(args, args.dataset, dataset_keys[args.dataset] if args.dataset in dataset_keys else None)
275
+
276
+ save_data(args.output_file, args, data)
data_truncator.py ADDED
@@ -0,0 +1,97 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+ import time
6
+
7
+ import numpy as np
8
+ import datasets
9
+ import torch
10
+ import random
11
+ import argparse
12
+ import os
13
+ import json
14
+ import custom_datasets
15
+ from model import load_tokenizer, load_model
16
+
17
+ def stats_str(data):
18
+ if type(data) == dict:
19
+ mean_orig = np.mean([len(v.split()) for v in data['original']])
20
+ mean_samp = np.mean([len(v.split()) for v in data['sampled']])
21
+ return f'{mean_orig:.0f} words (original), {mean_samp:.0f} words (sampled).'
22
+ else:
23
+ mean_orig = np.mean([len(v['original'].split()) for v in data])
24
+ mean_samp = np.mean([len(v['sampled'].split()) for v in data])
25
+ mean_perturb_orig = np.mean([np.mean([len(p.split()) for p in v['perturbed_original']]) for v in data])
26
+ mean_perturb_samp = np.mean([np.mean([len(p.split()) for p in v['perturbed_sampled']]) for v in data])
27
+ return f'{mean_orig:.0f} words (original), {mean_samp:.0f} words (sampled), {mean_perturb_orig:.0f} words (perturb original), {mean_perturb_samp:.0f} words (perturb sampled).'
28
+
29
+ def save_data(output_file, args, data):
30
+ # write args to file
31
+ args_file = f"{output_file}.args.json"
32
+ with open(args_file, "w") as fout:
33
+ json.dump(args, fout, indent=4)
34
+ print(f"Args written into {args_file}")
35
+
36
+ # write the data to a json file in the save folder
37
+ data_file = f"{output_file}.raw_data.json"
38
+ with open(data_file, "w") as fout:
39
+ json.dump(data, fout, indent=4)
40
+ print(f"Raw data written into {data_file}: {stats_str(data)}")
41
+
42
+
43
+ def load_data(input_file):
44
+ # load args from file
45
+ args_file = f"{input_file}.args.json"
46
+ with open(args_file, "r") as fin:
47
+ args = json.load(fin)
48
+ print(f"Args loaded from {args_file}")
49
+
50
+ # load the data from file
51
+ data_file = f"{input_file}.raw_data.json"
52
+ with open(data_file, "r") as fin:
53
+ data = json.load(fin)
54
+ print(f"Raw data loaded from {data_file}: {stats_str(data)}")
55
+
56
+ return args, data
57
+
58
+ def convert_data(input_file, output_file, max_words):
59
+ def _reduce(text):
60
+ lines = []
61
+ nwords = 0
62
+ for line in text.split('\n'):
63
+ if nwords >= max_words:
64
+ break
65
+ words = line.split()
66
+ words = words[:max_words - nwords]
67
+ lines.append(' '.join(words))
68
+ nwords += len(words)
69
+ return '\n'.join(lines)
70
+
71
+ args, data = load_data(input_file)
72
+ if type(data) == dict:
73
+ data['original'] = [_reduce(x) for x in data['original']]
74
+ data['sampled'] = [_reduce(x) for x in data['sampled']]
75
+ else:
76
+ for item in data:
77
+ item['original'] = _reduce(item['original'])
78
+ item['sampled'] = _reduce(item['sampled'])
79
+ item['perturbed_original'] = [_reduce(x) for x in item['perturbed_original']]
80
+ item['perturbed_sampled'] = [_reduce(x) for x in item['perturbed_sampled']]
81
+
82
+ save_data(output_file, args, data)
83
+
84
+ if __name__ == '__main__':
85
+ parser = argparse.ArgumentParser()
86
+ parser.add_argument('--input_path', type=str, default="./exp_gpt3to4/data/")
87
+ parser.add_argument('--output_path', type=str, default="./exp_maxlen150/data/")
88
+ parser.add_argument('--max_words', type=int, default=150)
89
+ args = parser.parse_args()
90
+
91
+ import glob
92
+ import os.path as path
93
+
94
+ for file_name in glob.glob(f'{args.input_path}/*.raw_data.json'):
95
+ print(file_name)
96
+ file_name = path.basename(file_name).replace('.raw_data.json', '')
97
+ convert_data(path.join(args.input_path, file_name), path.join(args.output_path, file_name), args.max_words)
detect_gpt.py ADDED
@@ -0,0 +1,295 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+ import os.path
6
+
7
+ import numpy as np
8
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
9
+ import re
10
+ import torch
11
+ import tqdm
12
+ import argparse
13
+ import json
14
+ from data_builder import load_data, save_data
15
+ from metrics import get_roc_metrics, get_precision_recall_metrics
16
+ from model import load_tokenizer, load_model, get_model_fullname, from_pretrained
17
+
18
+ # define regex to match all <extra_id_*> tokens, where * is an integer
19
+ pattern = re.compile(r"<extra_id_\d+>")
20
+
21
+ def load_mask_model(model_name, device, cache_dir):
22
+ model_name = get_model_fullname(model_name)
23
+ # mask filling t5 model
24
+ print(f'Loading mask filling model {model_name}...')
25
+ mask_model = from_pretrained(AutoModelForSeq2SeqLM, model_name, {}, cache_dir)
26
+ mask_model = mask_model.to(device)
27
+ return mask_model
28
+
29
+ def load_mask_tokenizer(model_name, max_length, cache_dir):
30
+ model_name = get_model_fullname(model_name)
31
+ tokenizer = from_pretrained(AutoTokenizer, model_name, {'model_max_length': max_length}, cache_dir)
32
+ return tokenizer
33
+
34
+ def tokenize_and_mask(text, span_length, pct, ceil_pct=False):
35
+ buffer_size = 1
36
+ tokens = text.split(' ')
37
+ mask_string = '<<<mask>>>'
38
+
39
+ n_spans = pct * len(tokens) / (span_length + buffer_size * 2)
40
+ if ceil_pct:
41
+ n_spans = np.ceil(n_spans)
42
+ n_spans = int(n_spans)
43
+
44
+ n_masks = 0
45
+ while n_masks < n_spans:
46
+ start = np.random.randint(0, len(tokens) - span_length)
47
+ end = start + span_length
48
+ search_start = max(0, start - buffer_size)
49
+ search_end = min(len(tokens), end + buffer_size)
50
+ if mask_string not in tokens[search_start:search_end]:
51
+ tokens[start:end] = [mask_string]
52
+ n_masks += 1
53
+
54
+ # replace each occurrence of mask_string with <extra_id_NUM>, where NUM increments
55
+ num_filled = 0
56
+ for idx, token in enumerate(tokens):
57
+ if token == mask_string:
58
+ tokens[idx] = f'<extra_id_{num_filled}>'
59
+ num_filled += 1
60
+ assert num_filled == n_masks, f"num_filled {num_filled} != n_masks {n_masks}"
61
+ text = ' '.join(tokens)
62
+ return text
63
+
64
+ def count_masks(texts):
65
+ return [len([x for x in text.split() if x.startswith("<extra_id_")]) for text in texts]
66
+
67
+ # replace each masked span with a sample from T5 mask_model
68
+ def replace_masks(args, mask_model, mask_tokenizer, texts):
69
+ n_expected = count_masks(texts)
70
+ stop_id = mask_tokenizer.encode(f"<extra_id_{max(n_expected)}>")[0]
71
+ tokens = mask_tokenizer(texts, return_tensors="pt", padding=True).to(args.device)
72
+ outputs = mask_model.generate(**tokens, max_length=150, do_sample=True, top_p=args.mask_top_p,
73
+ num_return_sequences=1, eos_token_id=stop_id)
74
+ return mask_tokenizer.batch_decode(outputs, skip_special_tokens=False)
75
+
76
+ def extract_fills(texts):
77
+ # remove <pad> from beginning of each text
78
+ texts = [x.replace("<pad>", "").replace("</s>", "").strip() for x in texts]
79
+
80
+ # return the text in between each matched mask token
81
+ extracted_fills = [pattern.split(x)[1:-1] for x in texts]
82
+
83
+ # remove whitespace around each fill
84
+ extracted_fills = [[y.strip() for y in x] for x in extracted_fills]
85
+
86
+ return extracted_fills
87
+
88
+ def apply_extracted_fills(masked_texts, extracted_fills):
89
+ # split masked text into tokens, only splitting on spaces (not newlines)
90
+ tokens = [x.split(' ') for x in masked_texts]
91
+
92
+ n_expected = count_masks(masked_texts)
93
+
94
+ # replace each mask token with the corresponding fill
95
+ for idx, (text, fills, n) in enumerate(zip(tokens, extracted_fills, n_expected)):
96
+ if len(fills) < n:
97
+ tokens[idx] = []
98
+ else:
99
+ for fill_idx in range(n):
100
+ text[text.index(f"<extra_id_{fill_idx}>")] = fills[fill_idx]
101
+
102
+ # join tokens back into text
103
+ texts = [" ".join(x) for x in tokens]
104
+ return texts
105
+
106
+ def perturb_texts_(args, mask_model, mask_tokenizer, texts, ceil_pct=False):
107
+ span_length = args.span_length
108
+ pct = args.pct_words_masked
109
+ masked_texts = [tokenize_and_mask(x, span_length, pct, ceil_pct) for x in texts]
110
+ raw_fills = replace_masks(args, mask_model, mask_tokenizer, masked_texts)
111
+ extracted_fills = extract_fills(raw_fills)
112
+ perturbed_texts = apply_extracted_fills(masked_texts, extracted_fills)
113
+
114
+ # Handle the fact that sometimes the model doesn't generate the right number of fills and we have to try again
115
+ attempts = 1
116
+ while '' in perturbed_texts:
117
+ idxs = [idx for idx, x in enumerate(perturbed_texts) if x == '']
118
+ print(f'WARNING: {len(idxs)} texts have no fills. Trying again [attempt {attempts}].')
119
+ masked_texts = [tokenize_and_mask(x, span_length, pct, ceil_pct) for idx, x in enumerate(texts) if idx in idxs]
120
+ raw_fills = replace_masks(args, mask_model, mask_tokenizer, masked_texts)
121
+ extracted_fills = extract_fills(raw_fills)
122
+ new_perturbed_texts = apply_extracted_fills(masked_texts, extracted_fills)
123
+ for idx, x in zip(idxs, new_perturbed_texts):
124
+ perturbed_texts[idx] = x
125
+ attempts += 1
126
+ return perturbed_texts
127
+
128
+ def perturb_texts(args, mask_model, mask_tokenizer, texts, ceil_pct=False):
129
+ chunk_size = 10
130
+ outputs = []
131
+ for i in range(0, len(texts), chunk_size):
132
+ outputs.extend(perturb_texts_(args, mask_model, mask_tokenizer, texts[i:i + chunk_size], ceil_pct=ceil_pct))
133
+ return outputs
134
+
135
+ # Get the log likelihood of each text under the base_model
136
+ def get_ll(args, scoring_model, scoring_tokenizer, text):
137
+ with torch.no_grad():
138
+ tokenized = scoring_tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(args.device)
139
+ labels = tokenized.input_ids
140
+ return -scoring_model(**tokenized, labels=labels).loss.item()
141
+
142
+ def get_lls(args, scoring_model, scoring_tokenizer, texts):
143
+ return [get_ll(args, scoring_model, scoring_tokenizer, text) for text in texts]
144
+
145
+
146
+ def generate_perturbs(args):
147
+ n_perturbations = args.n_perturbations
148
+ name = f'perturbation_{n_perturbations}'
149
+ # load model
150
+ mask_model = load_mask_model(args.mask_filling_model_name, args.device, args.cache_dir)
151
+ mask_model.eval()
152
+ try:
153
+ n_positions = mask_model.config.n_positions
154
+ except AttributeError:
155
+ n_positions = 512
156
+ mask_tokenizer = load_mask_tokenizer(args.mask_filling_model_name, n_positions, args.cache_dir)
157
+
158
+ # load data
159
+ data = load_data(args.dataset_file)
160
+ n_samples = len(data["sampled"])
161
+
162
+ torch.manual_seed(args.seed)
163
+ np.random.seed(args.seed)
164
+
165
+ # generate perturb samples
166
+ perturbs = []
167
+ for idx in tqdm.tqdm(range(n_samples), desc=f"Perturb text"):
168
+ original_text = data["original"][idx]
169
+ sampled_text = data["sampled"][idx]
170
+ # perturb
171
+ p_sampled_text = perturb_texts(args, mask_model, mask_tokenizer, [sampled_text for _ in range(n_perturbations)])
172
+ p_original_text = perturb_texts(args, mask_model, mask_tokenizer, [original_text for _ in range(n_perturbations)])
173
+ assert len(p_sampled_text) == n_perturbations, f"Expected {n_perturbations} perturbed samples, got {len(p_sampled_text)}"
174
+ assert len(p_original_text) == n_perturbations, f"Expected {n_perturbations} perturbed samples, got {len(p_original_text)}"
175
+ # result
176
+ perturbs.append({
177
+ "original": original_text,
178
+ "sampled": sampled_text,
179
+ "perturbed_sampled": p_sampled_text,
180
+ "perturbed_original": p_original_text
181
+ })
182
+
183
+ save_data(f'{args.dataset_file}.{args.mask_filling_model_name}.{name}', args, perturbs)
184
+
185
+
186
+ def experiment(args):
187
+ n_perturbations = args.n_perturbations
188
+ name = f'perturbation_{n_perturbations}'
189
+ perturb_file = f'{args.dataset_file}.{args.mask_filling_model_name}.{name}.raw_data.json'
190
+ if os.path.exists(perturb_file):
191
+ print(f'Use existing perturbation file: {perturb_file}')
192
+ else:
193
+ generate_perturbs(args)
194
+ # load model
195
+ scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
196
+ scoring_model = load_model(args.scoring_model_name, 'cpu', args.cache_dir)
197
+ scoring_model.eval()
198
+ scoring_model.to(args.device)
199
+ # load data
200
+ data = load_data(f'{args.dataset_file}.{args.mask_filling_model_name}.{name}')
201
+ n_samples = len(data)
202
+
203
+ torch.manual_seed(args.seed)
204
+ np.random.seed(args.seed)
205
+
206
+ # Evaluate
207
+ results = data
208
+ for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
209
+ original_text = results[idx]["original"]
210
+ sampled_text = results[idx]["sampled"]
211
+ perturbed_original = results[idx]["perturbed_original"]
212
+ perturbed_sampled = results[idx]["perturbed_sampled"]
213
+ # original text
214
+ original_ll = get_ll(args, scoring_model, scoring_tokenizer, original_text)
215
+ p_original_ll = get_lls(args, scoring_model, scoring_tokenizer, perturbed_original)
216
+ # sampled text
217
+ sampled_ll = get_ll(args, scoring_model, scoring_tokenizer, sampled_text)
218
+ p_sampled_ll = get_lls(args, scoring_model, scoring_tokenizer, perturbed_sampled)
219
+ # result
220
+ results[idx]["original_ll"] = original_ll
221
+ results[idx]["sampled_ll"] = sampled_ll
222
+ results[idx]["all_perturbed_sampled_ll"] = p_sampled_ll
223
+ results[idx]["all_perturbed_original_ll"] = p_original_ll
224
+ results[idx]["perturbed_sampled_ll"] = np.mean(p_sampled_ll)
225
+ results[idx]["perturbed_original_ll"] = np.mean(p_original_ll)
226
+ results[idx]["perturbed_sampled_ll_std"] = np.std(p_sampled_ll) if len(p_sampled_ll) > 1 else 1
227
+ results[idx]["perturbed_original_ll_std"] = np.std(p_original_ll) if len(p_original_ll) > 1 else 1
228
+
229
+ # compute diffs with perturbed
230
+ predictions = {'real': [], 'samples': []}
231
+ for res in results:
232
+ if res['perturbed_original_ll_std'] == 0:
233
+ res['perturbed_original_ll_std'] = 1
234
+ print("WARNING: std of perturbed original is 0, setting to 1")
235
+ print(f"Number of unique perturbed original texts: {len(set(res['perturbed_original']))}")
236
+ print(f"Original text: {res['original']}")
237
+ if res['perturbed_sampled_ll_std'] == 0:
238
+ res['perturbed_sampled_ll_std'] = 1
239
+ print("WARNING: std of perturbed sampled is 0, setting to 1")
240
+ print(f"Number of unique perturbed sampled texts: {len(set(res['perturbed_sampled']))}")
241
+ print(f"Sampled text: {res['sampled']}")
242
+ predictions['real'].append((res['original_ll'] - res['perturbed_original_ll']) / res['perturbed_original_ll_std'])
243
+ predictions['samples'].append((res['sampled_ll'] - res['perturbed_sampled_ll']) / res['perturbed_sampled_ll_std'])
244
+
245
+ print(f"Real mean/std: {np.mean(predictions['real']):.2f}/{np.std(predictions['real']):.2f}, Samples mean/std: {np.mean(predictions['samples']):.2f}/{np.std(predictions['samples']):.2f}")
246
+ fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
247
+ p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
248
+ print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
249
+
250
+ # results
251
+ results_file = f'{args.output_file}.{name}.json'
252
+ results = {
253
+ 'name': name,
254
+ 'info': {
255
+ 'pct_words_masked': args.pct_words_masked,
256
+ 'span_length': args.span_length,
257
+ 'n_perturbations': args.n_perturbations,
258
+ 'n_samples': n_samples,
259
+ },
260
+ 'predictions': predictions,
261
+ 'raw_results': results,
262
+ 'metrics': {
263
+ 'roc_auc': roc_auc,
264
+ 'fpr': fpr,
265
+ 'tpr': tpr,
266
+ },
267
+ 'pr_metrics': {
268
+ 'pr_auc': pr_auc,
269
+ 'precision': p,
270
+ 'recall': r,
271
+ },
272
+ 'loss': 1 - pr_auc,
273
+ }
274
+ with open(results_file, 'w') as fout:
275
+ json.dump(results, fout)
276
+ print(f'Results written into {results_file}')
277
+
278
+
279
+ if __name__ == '__main__':
280
+ parser = argparse.ArgumentParser()
281
+ parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
282
+ parser.add_argument('--dataset', type=str, default="xsum")
283
+ parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
284
+ parser.add_argument('--pct_words_masked', type=float, default=0.3) # pct masked is actually pct_words_masked * (span_length / (span_length + 2 * buffer_size))
285
+ parser.add_argument('--mask_top_p', type=float, default=1.0)
286
+ parser.add_argument('--span_length', type=int, default=2)
287
+ parser.add_argument('--n_perturbations', type=int, default=10)
288
+ parser.add_argument('--scoring_model_name', type=str, default="gpt2")
289
+ parser.add_argument('--mask_filling_model_name', type=str, default="t5-small")
290
+ parser.add_argument('--seed', type=int, default=0)
291
+ parser.add_argument('--device', type=str, default="cuda")
292
+ parser.add_argument('--cache_dir', type=str, default="../cache")
293
+ args = parser.parse_args()
294
+
295
+ experiment(args)
detect_llm.py ADDED
@@ -0,0 +1,128 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+
6
+ import numpy as np
7
+ import torch
8
+ import torch.nn.functional as F
9
+ import tqdm
10
+ import argparse
11
+ import json
12
+ from model import load_tokenizer, load_model
13
+ from metrics import get_roc_metrics, get_precision_recall_metrics
14
+ from data_builder import load_data
15
+
16
+ def get_likelihood(logits, labels):
17
+ assert logits.shape[0] == 1
18
+ assert labels.shape[0] == 1
19
+
20
+ logits = logits.view(-1, logits.shape[-1])
21
+ labels = labels.view(-1)
22
+ log_probs = torch.nn.functional.log_softmax(logits, dim=-1)
23
+ log_likelihood = log_probs.gather(dim=-1, index=labels.unsqueeze(-1)).squeeze(-1)
24
+ return log_likelihood.mean().item()
25
+
26
+ def get_logrank(logits, labels):
27
+ assert logits.shape[0] == 1
28
+ assert labels.shape[0] == 1
29
+
30
+ # get rank of each label token in the model's likelihood ordering
31
+ matches = (logits.argsort(-1, descending=True) == labels.unsqueeze(-1)).nonzero()
32
+ assert matches.shape[1] == 3, f"Expected 3 dimensions in matches tensor, got {matches.shape}"
33
+
34
+ ranks, timesteps = matches[:, -1], matches[:, -2]
35
+
36
+ # make sure we got exactly one match for each timestep in the sequence
37
+ assert (timesteps == torch.arange(len(timesteps)).to(timesteps.device)).all(), "Expected one match per timestep"
38
+
39
+ ranks = ranks.float() + 1 # convert to 1-indexed rank
40
+ ranks = torch.log(ranks)
41
+ return ranks.mean().item()
42
+
43
+ # Log-Likelihood Log-Rank Ratio
44
+ def get_lrr(args, scoring_model, scoring_tokenizer, text, perturbs):
45
+ with torch.no_grad():
46
+ tokenized = scoring_tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(args.device)
47
+ labels = tokenized.input_ids[:, 1:]
48
+ logits = scoring_model(**tokenized).logits[:, :-1]
49
+ likelihood = get_likelihood(logits, labels)
50
+ logrank = get_logrank(logits, labels)
51
+ return - likelihood / logrank
52
+
53
+ # Normalized Log-Rank Perturbation
54
+ def get_npr(args, scoring_model, scoring_tokenizer, text, perturbs):
55
+ with torch.no_grad():
56
+ tokenized = scoring_tokenizer(text, return_tensors="pt", return_token_type_ids=False).to(args.device)
57
+ labels = tokenized.input_ids[:, 1:]
58
+ logits = scoring_model(**tokenized).logits[:, :-1]
59
+ logrank = get_logrank(logits, labels)
60
+ # perturbations
61
+ logranks = []
62
+ for perturb in perturbs:
63
+ tokenized = scoring_tokenizer(perturb, return_tensors="pt", return_token_type_ids=False).to(args.device)
64
+ labels = tokenized.input_ids[:, 1:]
65
+ logits = scoring_model(**tokenized).logits[:, :-1]
66
+ logranks.append(get_logrank(logits, labels))
67
+ # npr
68
+ return np.mean(logranks) / logrank
69
+
70
+ def experiment(args):
71
+ # load model
72
+ scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
73
+ scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
74
+ scoring_model.eval()
75
+ # load data
76
+ data = load_data(args.dataset_file)
77
+ n_samples = len(data)
78
+ # evaluation criteria
79
+ criterion_fns = {'lrr': get_lrr, 'npr': get_npr}
80
+ for name in criterion_fns:
81
+ criterion_fn = criterion_fns[name]
82
+ torch.manual_seed(args.seed)
83
+ np.random.seed(args.seed)
84
+ eval_results = []
85
+ for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
86
+ original_text = data[idx]["original"]
87
+ sampled_text = data[idx]["sampled"]
88
+ perturbed_original = data[idx]["perturbed_original"]
89
+ perturbed_sampled = data[idx]["perturbed_sampled"]
90
+ original_crit = criterion_fn(args, scoring_model, scoring_tokenizer, original_text, perturbed_original)
91
+ sampled_crit = criterion_fn(args, scoring_model, scoring_tokenizer, sampled_text, perturbed_sampled)
92
+ # result
93
+ eval_results.append({"original": original_text,
94
+ "original_crit": original_crit,
95
+ "sampled": sampled_text,
96
+ "sampled_crit": sampled_crit})
97
+
98
+ # compute prediction scores for real/sampled passages
99
+ predictions = {'real': [x["original_crit"] for x in eval_results],
100
+ 'samples': [x["sampled_crit"] for x in eval_results]}
101
+ fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
102
+ p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
103
+ print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
104
+ # log results
105
+ results_file = f'{args.output_file}.{name}.json'
106
+ results = { 'name': f'{name}_threshold',
107
+ 'info': {'n_samples': n_samples},
108
+ 'predictions': predictions,
109
+ 'raw_results': eval_results,
110
+ 'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
111
+ 'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
112
+ 'loss': 1 - pr_auc}
113
+ with open(results_file, 'w') as fout:
114
+ json.dump(results, fout)
115
+ print(f'Results written into {results_file}')
116
+
117
+ if __name__ == '__main__':
118
+ parser = argparse.ArgumentParser()
119
+ parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
120
+ parser.add_argument('--dataset', type=str, default="xsum")
121
+ parser.add_argument('--dataset_file', type=str, default="./exp_test/results/xsum_gpt2.perturbation_10")
122
+ parser.add_argument('--scoring_model_name', type=str, default="gpt2")
123
+ parser.add_argument('--seed', type=int, default=0)
124
+ parser.add_argument('--device', type=str, default="cuda")
125
+ parser.add_argument('--cache_dir', type=str, default="../cache")
126
+ args = parser.parse_args()
127
+
128
+ experiment(args)
detector.py ADDED
@@ -0,0 +1,11 @@
1
+ class Detector:
2
+ def __init__(self):
3
+ # You can configure loading of the model or any required files here
4
+ print("Fast-DetectGPT initialized!")
5
+
6
+ def detect(self, text):
7
+ """
8
+ Analyzes the given text and returns the result.
9
+ """
10
+ # A sample result is returned instead of running the actual analysis
11
+ return [(text, 0.85)] # 0.85 is the probability that the text is AI-generated
dna_gpt.py ADDED
@@ -0,0 +1,211 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+ import os.path
6
+
7
+ import numpy as np
8
+ from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
9
+ import re
10
+ import torch
11
+ import tqdm
12
+ import argparse
13
+ import json
14
+ from data_builder import load_data, save_data
15
+ from metrics import get_roc_metrics, get_precision_recall_metrics
16
+ from model import load_tokenizer, load_model, get_model_fullname, from_pretrained
20
+ import custom_datasets
21
+
22
+ class PrefixSampler:
23
+ def __init__(self, args):
24
+ self.args = args
25
+ self.base_tokenizer = load_tokenizer(args.base_model_name, args.dataset, args.cache_dir)
26
+ self.base_model = load_model(args.base_model_name, args.device, args.cache_dir)
27
+
28
+ def _sample_from_model(self, texts, min_words=55, truncate_ratio=0.5):
29
+ # encode each text as a list of token ids
30
+ if self.args.dataset == 'pubmed':
31
+ pubmed_sep = ' Answer:'
32
+ texts = [t[:t.index(pubmed_sep) + len(pubmed_sep)] for t in texts]
33
+ all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True).to(self.args.device)
34
+ else:
35
+ texts = [t.split(' ') for t in texts]
36
+ texts = [' '.join(t[: int(len(t) * truncate_ratio)]) for t in texts]
37
+ all_encoded = self.base_tokenizer(texts, return_tensors="pt", padding=True).to(self.args.device)
38
+
39
+ self.base_model.eval()
40
+ decoded = ['' for _ in range(len(texts))]
41
+
42
+ # sample from the model until we get a sample with at least min_words words for each example
43
+ # this is an inefficient way to do this (since we regenerate for all inputs if just one is too short), but it works
44
+ tries = 0
45
+ m = 0
46
+ while m < min_words:
47
+ if tries != 0:
48
+ print()
49
+ print(f"min words: {m}, needed {min_words}, regenerating (try {tries})")
50
+
51
+ sampling_kwargs = {'temperature': self.args.temperature}
52
+ if self.args.do_top_p:
53
+ sampling_kwargs['top_p'] = self.args.top_p
54
+ elif self.args.do_top_k:
55
+ sampling_kwargs['top_k'] = self.args.top_k
56
+ min_length = 50 if self.args.dataset in ['pubmed'] else 150
57
+ outputs = self.base_model.generate(**all_encoded, min_length=min_length, max_length=200, do_sample=True,
58
+ **sampling_kwargs, pad_token_id=self.base_tokenizer.eos_token_id,
59
+ eos_token_id=self.base_tokenizer.eos_token_id)
60
+ decoded = self.base_tokenizer.batch_decode(outputs, skip_special_tokens=True)
61
+ m = min(len(x.split()) for x in decoded)
62
+ tries += 1
63
+
64
+ return decoded
65
+
66
+ def generate_samples(self, raw_data, batch_size):
67
+ # trim to shorter length
68
+ def _trim_to_shorter_length(texta, textb):
69
+ # truncate to shorter of o and s
70
+ shorter_length = min(len(texta.split(' ')), len(textb.split(' ')))
71
+ texta = ' '.join(texta.split(' ')[:shorter_length])
72
+ textb = ' '.join(textb.split(' ')[:shorter_length])
73
+ return texta, textb
74
+
75
+ def _truncate_to_substring(text, substring, idx_occurrence):
76
+ # truncate everything after the idx_occurrence occurrence of substring
77
+ assert idx_occurrence > 0, 'idx_occurrence must be > 0'
78
+ idx = -1
79
+ for _ in range(idx_occurrence):
80
+ idx = text.find(substring, idx + 1)
81
+ if idx == -1:
82
+ return text
83
+ return text[:idx]
84
+
85
+ data = {
86
+ "original": [],
87
+ "sampled": [],
88
+ }
89
+
90
+ assert len(raw_data) % batch_size == 0
91
+ for batch in range(len(raw_data) // batch_size):
92
+ print('Generating samples for batch', batch, 'of', len(raw_data) // batch_size)
93
+ original_text = raw_data[batch * batch_size:(batch + 1) * batch_size]
94
+ sampled_text = self._sample_from_model(original_text, min_words=30 if self.args.dataset in ['pubmed'] else 55, truncate_ratio=self.args.truncate_ratio)
95
+
96
+ for o, s in zip(original_text, sampled_text):
97
+ if self.args.dataset == 'pubmed':
98
+ s = _truncate_to_substring(s, 'Question:', 2)
99
+ o = o.replace(custom_datasets.SEPARATOR, ' ')
100
+
101
+ o, s = _trim_to_shorter_length(o, s)
102
+
103
+ # add to the data
104
+ data["original"].append(o)
105
+ data["sampled"].append(s)
106
+
107
+ return data
108
+
109
+ def get_likelihood(logits, labels, pad_index):
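+ # average per-token log-likelihood of labels under logits, ignoring padded positions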
110
+ labels = labels.unsqueeze(-1) if labels.ndim == logits.ndim - 1 else labels
111
+ lprobs = torch.log_softmax(logits, dim=-1)
112
+ log_likelihood = lprobs.gather(dim=-1, index=labels)
113
+ mask = labels != pad_index
114
+ log_likelihood = (log_likelihood * mask).sum(dim=1) / mask.sum(dim=1)
115
+ return log_likelihood.squeeze(-1)
116
+
117
+ def get_log_prob(sampler, text):
118
+ tokenized = sampler.base_tokenizer(text, return_tensors="pt", padding=True).to(sampler.args.device)
119
+ labels = tokenized.input_ids[:, 1:]
120
+ with torch.no_grad():
121
+ logits_score = sampler.base_model(**tokenized).logits[:, :-1]
122
+ return get_likelihood(logits_score, labels, sampler.base_tokenizer.pad_token_id)
123
+
124
+ def get_log_probs(sampler, texts):
125
+ batch_size = sampler.args.batch_size
126
+ batch_lprobs = []
127
+ for batch in range(len(texts) // batch_size):
128
+ tokenized = sampler.base_tokenizer(texts[batch * batch_size:(batch + 1) * batch_size], return_tensors="pt", padding=True).to(sampler.args.device)
129
+ labels = tokenized.input_ids[:, 1:]
130
+ with torch.no_grad():
131
+ logits_score = sampler.base_model(**tokenized).logits[:, :-1]
132
+ lprobs = get_likelihood(logits_score, labels, sampler.base_tokenizer.pad_token_id)
133
+ batch_lprobs.append(lprobs)
134
+ return torch.cat(batch_lprobs, dim=0)
135
+
136
+ def get_regen_samples(sampler, text):
137
+ data = [text] * sampler.args.regen_number
138
+ data = sampler.generate_samples(data, batch_size=sampler.args.batch_size)
139
+ return data['sampled']
140
+
141
+ def get_dna_gpt(sampler, text):
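+ # White-box DNA-GPT score: the passage's own average log-likelihood minus the mean
+ # log-likelihood of continuations regenerated from a truncated prefix of the passage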
142
+ lprob = get_log_prob(sampler, text)
143
+ regens = get_regen_samples(sampler, text)
144
+ lprob_regens = get_log_probs(sampler, regens)
145
+ wscore = lprob[0] - lprob_regens.mean()
146
+ return wscore.item()
147
+
148
+ def experiment(args):
149
+ sampler = PrefixSampler(args)
150
+ # load data
151
+ data = load_data(args.dataset_file)
152
+ n_samples = len(data["sampled"])
153
+ # evaluate criterion
154
+ name = "dna_gpt"
155
+ criterion_fn = get_dna_gpt
156
+
157
+ torch.manual_seed(args.seed)
158
+ np.random.seed(args.seed)
159
+ results = []
160
+ for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
161
+ original_text = data["original"][idx]
162
+ sampled_text = data["sampled"][idx]
163
+ # original text
164
+ original_crit = criterion_fn(sampler, original_text)
165
+ # sampled text
166
+ sampled_crit = criterion_fn(sampler, sampled_text)
167
+ # result
168
+ results.append({"original": original_text,
169
+ "original_crit": original_crit,
170
+ "sampled": sampled_text,
171
+ "sampled_crit": sampled_crit})
172
+
173
+ # compute prediction scores for real/sampled passages
174
+ predictions = {'real': [x["original_crit"] for x in results],
175
+ 'samples': [x["sampled_crit"] for x in results]}
176
+ fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
177
+ p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
178
+ print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
179
+ # results
180
+ results_file = f'{args.output_file}.{name}.json'
181
+ results = { 'name': f'{name}_threshold',
182
+ 'info': {'n_samples': n_samples},
183
+ 'predictions': predictions,
184
+ 'raw_results': results,
185
+ 'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
186
+ 'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
187
+ 'loss': 1 - pr_auc}
188
+ with open(results_file, 'w') as fout:
189
+ json.dump(results, fout)
190
+ print(f'Results written into {results_file}')
191
+
192
+ if __name__ == '__main__':
193
+ parser = argparse.ArgumentParser()
194
+ parser.add_argument('--output_file', type=str, default="./exp_test/results/pubmed_davinci")
195
+ parser.add_argument('--dataset', type=str, default="pubmed")
196
+ parser.add_argument('--dataset_file', type=str, default="./exp_test/data/pubmed_davinci")
197
+ parser.add_argument('--truncate_ratio', type=float, default=0.5)
198
+ parser.add_argument('--regen_number', type=int, default=10)
199
+ parser.add_argument('--base_model_name', type=str, default="gpt2")
200
+ parser.add_argument('--batch_size', type=int, default=10)
201
+ parser.add_argument('--do_top_k', action='store_true')
202
+ parser.add_argument('--top_k', type=int, default=40)
203
+ parser.add_argument('--do_top_p', action='store_true')
204
+ parser.add_argument('--top_p', type=float, default=0.96)
205
+ parser.add_argument('--temperature', type=float, default=1.0)
206
+ parser.add_argument('--seed', type=int, default=0)
207
+ parser.add_argument('--device', type=str, default="cuda")
208
+ parser.add_argument('--cache_dir', type=str, default="../cache")
209
+ args = parser.parse_args()
210
+
211
+ experiment(args)
fast_detect_gpt.py ADDED
@@ -0,0 +1,162 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+ import random
6
+
7
+ import numpy as np
8
+ import torch
9
+ import torch.nn.functional as F
10
+ import tqdm
11
+ import argparse
12
+ import json
13
+ from data_builder import load_data
14
+ from model import load_tokenizer, load_model
15
+ from metrics import get_roc_metrics, get_precision_recall_metrics
16
+
17
+ def get_samples(logits, labels):
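+ # draw nsamples (10,000) alternative tokens per position from the given token distribution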
18
+ assert logits.shape[0] == 1
19
+ assert labels.shape[0] == 1
20
+ nsamples = 10000
21
+ lprobs = torch.log_softmax(logits, dim=-1)
22
+ distrib = torch.distributions.categorical.Categorical(logits=lprobs)
23
+ samples = distrib.sample([nsamples]).permute([1, 2, 0])
24
+ return samples
25
+
26
+ def get_likelihood(logits, labels):
27
+ assert logits.shape[0] == 1
28
+ assert labels.shape[0] == 1
29
+ labels = labels.unsqueeze(-1) if labels.ndim == logits.ndim - 1 else labels
30
+ lprobs = torch.log_softmax(logits, dim=-1)
31
+ log_likelihood = lprobs.gather(dim=-1, index=labels)
32
+ return log_likelihood.mean(dim=1)
33
+
34
+ def get_sampling_discrepancy(logits_ref, logits_score, labels):
35
+ assert logits_ref.shape[0] == 1
36
+ assert logits_score.shape[0] == 1
37
+ assert labels.shape[0] == 1
38
+ if logits_ref.size(-1) != logits_score.size(-1):
39
+ # print(f"WARNING: vocabulary size mismatch {logits_ref.size(-1)} vs {logits_score.size(-1)}.")
40
+ vocab_size = min(logits_ref.size(-1), logits_score.size(-1))
41
+ logits_ref = logits_ref[:, :, :vocab_size]
42
+ logits_score = logits_score[:, :, :vocab_size]
43
+
44
+ samples = get_samples(logits_ref, labels)
45
+ log_likelihood_x = get_likelihood(logits_score, labels)
46
+ log_likelihood_x_tilde = get_likelihood(logits_score, samples)
47
+ miu_tilde = log_likelihood_x_tilde.mean(dim=-1)
48
+ sigma_tilde = log_likelihood_x_tilde.std(dim=-1)
49
+ discrepancy = (log_likelihood_x.squeeze(-1) - miu_tilde) / sigma_tilde
50
+ return discrepancy.item()
51
+
52
+ def get_sampling_discrepancy_analytic(logits_ref, logits_score, labels):
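+ # Analytic form of the sampling discrepancy: instead of drawing Monte-Carlo samples,
+ # the mean and variance of the scoring log-probabilities under the reference
+ # distribution are computed in closed form, and the curvature is the normalized
+ # deviation of the observed log-likelihood from that mean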
53
+ assert logits_ref.shape[0] == 1
54
+ assert logits_score.shape[0] == 1
55
+ assert labels.shape[0] == 1
56
+ if logits_ref.size(-1) != logits_score.size(-1):
57
+ # print(f"WARNING: vocabulary size mismatch {logits_ref.size(-1)} vs {logits_score.size(-1)}.")
58
+ vocab_size = min(logits_ref.size(-1), logits_score.size(-1))
59
+ logits_ref = logits_ref[:, :, :vocab_size]
60
+ logits_score = logits_score[:, :, :vocab_size]
61
+
62
+ labels = labels.unsqueeze(-1) if labels.ndim == logits_score.ndim - 1 else labels
63
+ lprobs_score = torch.log_softmax(logits_score, dim=-1)
64
+ probs_ref = torch.softmax(logits_ref, dim=-1)
65
+ log_likelihood = lprobs_score.gather(dim=-1, index=labels).squeeze(-1)
66
+ mean_ref = (probs_ref * lprobs_score).sum(dim=-1)
67
+ var_ref = (probs_ref * torch.square(lprobs_score)).sum(dim=-1) - torch.square(mean_ref)
68
+ discrepancy = (log_likelihood.sum(dim=-1) - mean_ref.sum(dim=-1)) / var_ref.sum(dim=-1).sqrt()
69
+ discrepancy = discrepancy.mean()
70
+ return discrepancy.item()
71
+
72
+ def experiment(args):
73
+ # load model
74
+ scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
75
+ scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
76
+ scoring_model.eval()
77
+ if args.reference_model_name != args.scoring_model_name:
78
+ reference_tokenizer = load_tokenizer(args.reference_model_name, args.dataset, args.cache_dir)
79
+ reference_model = load_model(args.reference_model_name, args.device, args.cache_dir)
80
+ reference_model.eval()
81
+ # load data
82
+ data = load_data(args.dataset_file)
83
+ n_samples = len(data["sampled"])
84
+ # evaluate criterion
85
+ if args.discrepancy_analytic:
86
+ name = "sampling_discrepancy_analytic"
87
+ criterion_fn = get_sampling_discrepancy_analytic
88
+ else:
89
+ name = "sampling_discrepancy"
90
+ criterion_fn = get_sampling_discrepancy
91
+
92
+ random.seed(args.seed)
93
+ torch.manual_seed(args.seed)
94
+ np.random.seed(args.seed)
95
+ results = []
96
+ for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
97
+ original_text = data["original"][idx]
98
+ sampled_text = data["sampled"][idx]
99
+ # original text
100
+ tokenized = scoring_tokenizer(original_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
101
+ labels = tokenized.input_ids[:, 1:]
102
+ with torch.no_grad():
103
+ logits_score = scoring_model(**tokenized).logits[:, :-1]
104
+ if args.reference_model_name == args.scoring_model_name:
105
+ logits_ref = logits_score
106
+ else:
107
+ tokenized = reference_tokenizer(original_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
108
+ assert torch.all(tokenized.input_ids[:, 1:] == labels), "Tokenizer mismatch."
109
+ logits_ref = reference_model(**tokenized).logits[:, :-1]
110
+ original_crit = criterion_fn(logits_ref, logits_score, labels)
111
+ # sampled text
112
+ tokenized = scoring_tokenizer(sampled_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
113
+ labels = tokenized.input_ids[:, 1:]
114
+ with torch.no_grad():
115
+ logits_score = scoring_model(**tokenized).logits[:, :-1]
116
+ if args.reference_model_name == args.scoring_model_name:
117
+ logits_ref = logits_score
118
+ else:
119
+ tokenized = reference_tokenizer(sampled_text, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
120
+ assert torch.all(tokenized.input_ids[:, 1:] == labels), "Tokenizer mismatch."
121
+ logits_ref = reference_model(**tokenized).logits[:, :-1]
122
+ sampled_crit = criterion_fn(logits_ref, logits_score, labels)
123
+ # result
124
+ results.append({"original": original_text,
125
+ "original_crit": original_crit,
126
+ "sampled": sampled_text,
127
+ "sampled_crit": sampled_crit})
128
+
129
+ # compute prediction scores for real/sampled passages
130
+ predictions = {'real': [x["original_crit"] for x in results],
131
+ 'samples': [x["sampled_crit"] for x in results]}
132
+ print(f"Real mean/std: {np.mean(predictions['real']):.2f}/{np.std(predictions['real']):.2f}, Samples mean/std: {np.mean(predictions['samples']):.2f}/{np.std(predictions['samples']):.2f}")
133
+ fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
134
+ p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
135
+ print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
136
+ # results
137
+ results_file = f'{args.output_file}.{name}.json'
138
+ results = { 'name': f'{name}_threshold',
139
+ 'info': {'n_samples': n_samples},
140
+ 'predictions': predictions,
141
+ 'raw_results': results,
142
+ 'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
143
+ 'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
144
+ 'loss': 1 - pr_auc}
145
+ with open(results_file, 'w') as fout:
146
+ json.dump(results, fout)
147
+ print(f'Results written into {results_file}')
148
+
149
+ if __name__ == '__main__':
150
+ parser = argparse.ArgumentParser()
151
+ parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
152
+ parser.add_argument('--dataset', type=str, default="xsum")
153
+ parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
154
+ parser.add_argument('--reference_model_name', type=str, default="gpt2")
155
+ parser.add_argument('--scoring_model_name', type=str, default="gpt2")
156
+ parser.add_argument('--discrepancy_analytic', action='store_true')
157
+ parser.add_argument('--seed', type=int, default=0)
158
+ parser.add_argument('--device', type=str, default="cuda")
159
+ parser.add_argument('--cache_dir', type=str, default="../cache")
160
+ args = parser.parse_args()
161
+
162
+ experiment(args)
gpt3to4.sh ADDED
@@ -0,0 +1,116 @@
1
+ #!/usr/bin/env bash
2
+ # Copyright (c) Guangsheng Bao.
3
+ #
4
+ # This source code is licensed under the MIT license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # setup the environment
8
+ echo `date`, Setup the environment ...
9
+ set -e # exit if error
10
+
11
+ # prepare folders
12
+ exp_path=exp_gpt3to4
13
+ data_path=$exp_path/data
14
+ res_path=$exp_path/results
15
+ mkdir -p $exp_path $data_path $res_path
16
+
17
+ datasets="xsum writing pubmed"
18
+ source_models="davinci gpt-3.5-turbo gpt-4"
19
+
20
+ # preparing dataset
21
+ openai_base="https://api.openai.com/v1"
22
+ openai_key="xxxxxxxx" # replace with your own key for generating your own test set
23
+
24
+ # We follow DetectGPT settings for generating text from GPT-3
25
+ M=davinci
26
+ for D in $datasets; do
27
+ echo `date`, Preparing dataset ${D} by sampling from openai/${M} ...
28
+ python scripts/data_builder.py --openai_model $M --openai_key $openai_key --openai_base $openai_base \
29
+ --dataset $D --n_samples 150 --do_top_p --top_p 0.9 --batch_size 1 \
30
+ --output_file $data_path/${D}_${M}
31
+ done
32
+
33
+ # We use a temperature of 0.8 for creative writing
34
+ for M in gpt-3.5-turbo gpt-4; do
35
+ for D in $datasets; do
36
+ echo `date`, Preparing dataset ${D} by sampling from openai/${M} ...
37
+ python scripts/data_builder.py --openai_model $M --openai_key $openai_key --openai_base $openai_base \
38
+ --dataset $D --n_samples 150 --do_temperature --temperature 0.8 --batch_size 1 \
39
+ --output_file $data_path/${D}_${M}
40
+ done
41
+ done
42
+
43
+ # evaluate Fast-DetectGPT in the black-box setting
44
+ settings="gpt-j-6B:gpt2-xl gpt-j-6B:gpt-neo-2.7B gpt-j-6B:gpt-j-6B"
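+ # each setting is "reference_model:scoring_model"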
45
+ for M in $source_models; do
46
+ for D in $datasets; do
47
+ for S in $settings; do
48
+ IFS=':' read -r -a S <<< $S && M1=${S[0]} && M2=${S[1]}
49
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
50
+ python scripts/fast_detect_gpt.py --reference_model_name $M1 --scoring_model_name $M2 --discrepancy_analytic \
51
+ --dataset $D --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
52
+ done
53
+ done
54
+ done
55
+
56
+ # evaluate supervised detectors
57
+ supervised_models="roberta-base-openai-detector roberta-large-openai-detector"
58
+ for M in $source_models; do
59
+ for D in $datasets; do
60
+ for SM in $supervised_models; do
61
+ echo `date`, Evaluating ${SM} on ${D}_${M} ...
62
+ python scripts/supervised.py --model_name $SM --dataset $D \
63
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
64
+ done
65
+ done
66
+ done
67
+
68
+ # evaluate baselines
69
+ scoring_models="gpt-neo-2.7B"
70
+ for M in $source_models; do
71
+ for D in $datasets; do
72
+ for M2 in $scoring_models; do
73
+ echo `date`, Evaluating baseline methods on ${D}_${M}.${M2} ...
74
+ python scripts/baselines.py --scoring_model_name ${M2} --dataset $D \
75
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M2}
76
+ done
77
+ done
78
+ done
79
+
80
+ # evaluate DNA-GPT
81
+ scoring_models="gpt-neo-2.7B"
82
+ for M in $source_models; do
83
+ for D in $datasets; do
84
+ for M2 in $scoring_models; do
85
+ echo `date`, Evaluating DNA-GPT on ${D}_${M}.${M2} ...
86
+ python scripts/dna_gpt.py --base_model_name ${M2} --dataset $D \
87
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M2}
88
+ done
89
+ done
90
+ done
91
+
92
+ # evaluate DetectGPT and DetectLLM
93
+ scoring_models="gpt2-xl gpt-neo-2.7B gpt-j-6B"
94
+ for M in $source_models; do
95
+ for D in $datasets; do
96
+ M1=t5-11b # perturbation model
97
+ for M2 in $scoring_models; do
98
+ echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
99
+ python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
100
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
101
+ # we leverage DetectGPT to generate the perturbations
102
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
103
+ python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
104
+ --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
105
+ done
106
+ done
107
+ done
108
+
109
+ # evaluate GPTZero
110
+ for M in $source_models; do
111
+ for D in $datasets; do
112
+ echo `date`, Evaluating GPTZero on ${D}_${M} ...
113
+ python scripts/gptzero.py --dataset $D \
114
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
115
+ done
116
+ done
gptzero.py ADDED
@@ -0,0 +1,84 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+ import time
6
+
7
+ import numpy as np
8
+ import tqdm
9
+ import argparse
10
+ import json
11
+ from metrics import get_roc_metrics, get_precision_recall_metrics
12
+ from data_builder import load_data
13
+
14
+ def detect_gptzero(args, text):
15
+ import requests
16
+ url = "https://api.gptzero.me/v2/predict/text"
17
+ payload = {
18
+ "document": text,
19
+ "version": "2023-09-14"
20
+ }
21
+ headers = {
22
+ "Accept": "application/json",
23
+ "content-type": "application/json",
24
+ "x-api-key": ""
25
+ }
26
+
27
+ while True:
28
+ try:
29
+ time.sleep(600) # 1 request per 10 minutes for free access
30
+ response = requests.post(url, json=payload, headers=headers)
31
+ return response.json()['documents'][0]['completely_generated_prob']
32
+ except Exception as ex:
33
+ print(ex)
34
+
35
+ def experiment(args):
36
+ # load data
37
+ data = load_data(args.dataset_file)
38
+ n_samples = len(data["sampled"])
39
+ # evaluate criterion
40
+ name = "gptzero"
41
+ criterion_fn = detect_gptzero
42
+
43
+ results = []
44
+ for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
45
+ original_text = data["original"][idx]
46
+ sampled_text = data["sampled"][idx]
47
+ original_crit = criterion_fn(args, original_text)
48
+ sampled_crit = criterion_fn(args, sampled_text)
49
+ # result
50
+ results.append({"original": original_text,
51
+ "original_crit": original_crit,
52
+ "sampled": sampled_text,
53
+ "sampled_crit": sampled_crit})
54
+
55
+ # compute prediction scores for real/sampled passages
56
+ predictions = {'real': [x["original_crit"] for x in results],
57
+ 'samples': [x["sampled_crit"] for x in results]}
58
+ print(f"Real mean/std: {np.mean(predictions['real']):.2f}/{np.std(predictions['real']):.2f}, Samples mean/std: {np.mean(predictions['samples']):.2f}/{np.std(predictions['samples']):.2f}")
59
+ fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
60
+ p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
61
+ print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
62
+
63
+ # results
64
+ results_file = f'{args.output_file}.{name}.json'
65
+ results = { 'name': f'{name}_threshold',
66
+ 'info': {'n_samples': n_samples},
67
+ 'predictions': predictions,
68
+ 'raw_results': results,
69
+ 'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
70
+ 'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
71
+ 'loss': 1 - pr_auc}
72
+ with open(results_file, 'w') as fout:
73
+ json.dump(results, fout)
74
+ print(f'Results written into {results_file}')
75
+
76
+ if __name__ == '__main__':
77
+ parser = argparse.ArgumentParser()
78
+ parser.add_argument('--output_file', type=str, default="./exp_gpt3to4/results/xsum_gpt-4")
79
+ parser.add_argument('--dataset', type=str, default="xsum")
80
+ parser.add_argument('--dataset_file', type=str, default="./exp_gpt3to4/data/xsum_gpt-4")
81
+ args = parser.parse_args()
82
+
83
+ experiment(args)
84
+
index.html ADDED
@@ -0,0 +1,106 @@
1
+ <!DOCTYPE html>
2
+ <html lang="en">
3
+ <head>
4
+ <meta charset="UTF-8">
5
+ <meta name="viewport" content="width=device-width, initial-scale=1.0">
6
+ <title>Fast-DetectGPT</title>
7
+ <style>
8
+ body {
9
+ font-family: Arial, sans-serif;
10
+ margin: 20px;
11
+ background-color: #f9f9f9;
12
+ }
13
+ .container {
14
+ max-width: 700px;
15
+ margin: auto;
16
+ background: #ffffff;
17
+ border-radius: 8px;
18
+ padding: 20px;
19
+ box-shadow: 0 4px 8px rgba(0, 0, 0, 0.2);
20
+ }
21
+ h1 {
22
+ text-align: center;
23
+ color: #333;
24
+ }
25
+ textarea {
26
+ width: 100%;
27
+ height: 150px;
28
+ margin: 15px 0;
29
+ padding: 10px;
30
+ border: 1px solid #ccc;
31
+ border-radius: 5px;
32
+ font-size: 16px;
33
+ }
34
+ button {
35
+ display: block;
36
+ width: 100%;
37
+ padding: 10px;
38
+ background-color: #007bff;
39
+ color: white;
40
+ border: none;
41
+ border-radius: 5px;
42
+ font-size: 16px;
43
+ cursor: pointer;
44
+ }
45
+ button:hover {
46
+ background-color: #0056b3;
47
+ }
48
+ #result {
49
+ margin-top: 20px;
50
+ padding: 15px;
51
+ background-color: #f1f1f1;
52
+ border: 1px solid #ddd;
53
+ border-radius: 5px;
54
+ }
55
+ .error {
56
+ color: red;
57
+ }
58
+ </style>
59
+ </head>
60
+ <body>
61
+ <div class="container">
62
+ <h1>Fast-DetectGPT</h1>
63
+ <form id="analyzeForm">
64
+ <textarea name="text" placeholder="Enter your text here..." required></textarea>
65
+ <button type="submit">Analyze</button>
66
+ </form>
67
+ <div id="result"></div>
68
+ </div>
69
+
70
+ <script>
71
+ document.getElementById('analyzeForm').addEventListener('submit', function (e) {
72
+ e.preventDefault(); // Prevent the form's default submit behavior.
73
+ const formData = new FormData(this);
74
+ const resultDiv = document.getElementById('result');
75
+
76
+ // Clear the previous result first
77
+ resultDiv.textContent = '';
78
+
79
+ // Send the POST request
80
+ fetch('/analyze', {
81
+ method: 'POST',
82
+ headers: {
83
+ 'Content-Type': 'application/json',
84
+ },
85
+ body: JSON.stringify({
86
+ text: formData.get('text'),
87
+ }),
88
+ })
89
+ .then(response => response.json())
90
+ .then(data => {
91
+ if (data.error) {
92
+ resultDiv.innerHTML = `<p class="error">Error: ${data.error}</p>`;
93
+ } else {
94
+ resultDiv.innerHTML = `
95
+ <p><strong>Criterion:</strong> ${data.criterion}</p>
96
+ <p><strong>Probability of being machine-generated:</strong> ${data.probability_machine_generated}</p>
97
+ `;
98
+ }
99
+ })
100
+ .catch(err => {
101
+ resultDiv.innerHTML = `<p class="error">An error occurred: ${err.message}</p>`;
102
+ });
103
+ });
104
+ </script>
105
+ </body>
106
+ </html>
local_infer.py ADDED
@@ -0,0 +1,94 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+ import random
6
+
7
+ import numpy as np
8
+ import torch
9
+ import os
10
+ import glob
11
+ import argparse
12
+ import json
13
+ from scripts.model import load_tokenizer, load_model
14
+ from scripts.fast_detect_gpt import get_sampling_discrepancy_analytic
15
+
16
+
17
+ # estimate the probability according to the distribution of our test results on ChatGPT and GPT-4
18
+ class ProbEstimator:
19
+ def __init__(self, args):
20
+ self.real_crits = []
21
+ self.fake_crits = []
22
+ for result_file in glob.glob(os.path.join(args.ref_path, '*.json')):
23
+ with open(result_file, 'r') as fin:
24
+ res = json.load(fin)
25
+ self.real_crits.extend(res['predictions']['real'])
26
+ self.fake_crits.extend(res['predictions']['samples'])
27
+ print(f'ProbEstimator: total {len(self.real_crits) * 2} samples.')
28
+
29
+
30
+ def crit_to_prob(self, crit):
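+ # local density estimate: among the ~100 reference criteria nearest to crit,
+ # return the fraction that come from machine-generated passages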
31
+ offset = np.sort(np.abs(np.array(self.real_crits + self.fake_crits) - crit))[100]
32
+ cnt_real = np.sum((np.array(self.real_crits) > crit - offset) & (np.array(self.real_crits) < crit + offset))
33
+ cnt_fake = np.sum((np.array(self.fake_crits) > crit - offset) & (np.array(self.fake_crits) < crit + offset))
34
+ return cnt_fake / (cnt_real + cnt_fake)
35
+
36
+ # run interactive local inference
37
+ def run(args):
38
+ # load model
39
+ scoring_tokenizer = load_tokenizer(args.scoring_model_name, args.dataset, args.cache_dir)
40
+ scoring_model = load_model(args.scoring_model_name, args.device, args.cache_dir)
41
+ scoring_model.eval()
42
+ if args.reference_model_name != args.scoring_model_name:
43
+ reference_tokenizer = load_tokenizer(args.reference_model_name, args.dataset, args.cache_dir)
44
+ reference_model = load_model(args.reference_model_name, args.device, args.cache_dir)
45
+ reference_model.eval()
46
+ # evaluate criterion
47
+ name = "sampling_discrepancy_analytic"
48
+ criterion_fn = get_sampling_discrepancy_analytic
49
+ prob_estimator = ProbEstimator(args)
50
+ # input text
51
+ print('Local demo for Fast-DetectGPT: longer texts give more reliable results.')
52
+ print('')
53
+ while True:
54
+ print("Please enter your text: (Press Enter twice to start processing)")
55
+ lines = []
56
+ while True:
57
+ line = input()
58
+ if len(line) == 0:
59
+ break
60
+ lines.append(line)
61
+ text = "\n".join(lines)
62
+ if len(text) == 0:
63
+ break
64
+ # evaluate text
65
+ tokenized = scoring_tokenizer(text, truncation=True, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
66
+ labels = tokenized.input_ids[:, 1:]
67
+ with torch.no_grad():
68
+ logits_score = scoring_model(**tokenized).logits[:, :-1]
69
+ if args.reference_model_name == args.scoring_model_name:
70
+ logits_ref = logits_score
71
+ else:
72
+ tokenized = reference_tokenizer(text, truncation=True, return_tensors="pt", padding=True, return_token_type_ids=False).to(args.device)
73
+ assert torch.all(tokenized.input_ids[:, 1:] == labels), "Tokenizer mismatch."
74
+ logits_ref = reference_model(**tokenized).logits[:, :-1]
75
+ crit = criterion_fn(logits_ref, logits_score, labels)
76
+ # estimate the probability of machine generated text
77
+ prob = prob_estimator.crit_to_prob(crit)
78
+ print(f'Fast-DetectGPT criterion is {crit:.4f}, suggesting that the text has a probability of {prob * 100:.0f}% to be machine-generated.')
79
+ print()
80
+
81
+ if __name__ == '__main__':
82
+ parser = argparse.ArgumentParser()
83
+ parser.add_argument('--reference_model_name', type=str, default="gpt-neo-2.7B") # use gpt-j-6B for more accurate detection
84
+ parser.add_argument('--scoring_model_name', type=str, default="gpt-neo-2.7B")
85
+ parser.add_argument('--dataset', type=str, default="xsum")
86
+ parser.add_argument('--ref_path', type=str, default="./local_infer_ref")
87
+ parser.add_argument('--device', type=str, default="cuda")
88
+ parser.add_argument('--cache_dir', type=str, default="../cache")
89
+ args = parser.parse_args()
90
+
91
+ run(args)
92
+
93
+
94
+
main.sh ADDED
@@ -0,0 +1,97 @@
1
+ #!/usr/bin/env bash
2
+ # Copyright (c) Guangsheng Bao.
3
+ #
4
+ # This source code is licensed under the MIT license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # setup the environment
8
+ echo `date`, Setup the environment ...
9
+ set -e # exit if error
10
+
11
+ # prepare folders
12
+ exp_path=exp_main
13
+ data_path=$exp_path/data
14
+ res_path=$exp_path/results
15
+ mkdir -p $exp_path $data_path $res_path
16
+
17
+ datasets="xsum squad writing"
18
+ source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"
19
+
20
+ # preparing dataset
21
+ for D in $datasets; do
22
+ for M in $source_models; do
23
+ echo `date`, Preparing dataset ${D}_${M} ...
24
+ python scripts/data_builder.py --dataset $D --n_samples 500 --base_model_name $M --output_file $data_path/${D}_${M}
25
+ done
26
+ done
27
+
28
+ # White-box Setting
29
+ echo `date`, Evaluate models in the white-box setting:
30
+
31
+ # evaluate Fast-DetectGPT and fast baselines
32
+ for D in $datasets; do
33
+ for M in $source_models; do
34
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
35
+ python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
36
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
37
+
38
+ echo `date`, Evaluating baseline methods on ${D}_${M} ...
39
+ python scripts/baselines.py --scoring_model_name $M --dataset $D \
40
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
41
+ done
42
+ done
43
+
44
+ # evaluate DNA-GPT
45
+ for D in $datasets; do
46
+ for M in $source_models; do
47
+ echo `date`, Evaluating DNA-GPT on ${D}_${M} ...
48
+ python scripts/dna_gpt.py --base_model_name $M --dataset $D \
49
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
50
+ done
51
+ done
52
+
53
+ # evaluate DetectGPT and its improvement DetectLLM
54
+ for D in $datasets; do
55
+ for M in $source_models; do
56
+ echo `date`, Evaluating DetectGPT on ${D}_${M} ...
57
+ python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
58
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
59
+ # we leverage DetectGPT to generate the perturbations
60
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
61
+ python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
62
+ --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
63
+ done
64
+ done
65
+
66
+
67
+ # Black-box Setting
68
+ echo `date`, Evaluate models in the black-box setting:
69
+ scoring_models="gpt-neo-2.7B"
70
+
71
+ # evaluate Fast-DetectGPT
72
+ for D in $datasets; do
73
+ for M in $source_models; do
74
+ M1=gpt-j-6B # sampling model
75
+ for M2 in $scoring_models; do
76
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
77
+ python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
78
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
79
+ done
80
+ done
81
+ done
82
+
83
+ # evaluate DetectGPT and its improvement DetectLLM
84
+ for D in $datasets; do
85
+ for M in $source_models; do
86
+ M1=t5-3b # perturbation model
87
+ for M2 in $scoring_models; do
88
+ echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
89
+ python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
90
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
91
+ # we leverage DetectGPT to generate the perturbations
92
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
93
+ python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
94
+ --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
95
+ done
96
+ done
97
+ done
main_ext.sh ADDED
@@ -0,0 +1,89 @@
1
+ #!/usr/bin/env bash
2
+ # Copyright (c) Guangsheng Bao.
3
+ #
4
+ # This source code is licensed under the MIT license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # setup the environment
8
+ echo `date`, Setup the environment ...
9
+ set -e # exit if error
10
+
11
+ # prepare folders
12
+ exp_path=exp_main_ext
13
+ data_path=$exp_path/data
14
+ res_path=$exp_path/results
15
+ mkdir -p $exp_path $data_path $res_path
16
+
17
+ datasets="xsum squad writing"
18
+ source_models="bloom-7b1 opt-13b llama-13b llama2-13b"
19
+
20
+ # preparing dataset
21
+ for D in $datasets; do
22
+ for M in $source_models; do
23
+ echo `date`, Preparing dataset ${D}_${M} ...
24
+ python scripts/data_builder.py --dataset $D --n_samples 500 --base_model_name $M --output_file $data_path/${D}_${M}
25
+ done
26
+ done
27
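+ # stop here after dataset preparation; remove this line to also run the evaluations below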
+ exit
28
+
29
+ # White-box Setting
30
+ echo `date`, Evaluate models in the white-box setting:
31
+
32
+ # evaluate Fast-DetectGPT and fast baselines
33
+ for D in $datasets; do
34
+ for M in $source_models; do
35
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
36
+ python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
37
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
38
+
39
+ echo `date`, Evaluating baseline methods on ${D}_${M} ...
40
+ python scripts/baselines.py --scoring_model_name $M --dataset $D \
41
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
42
+ done
43
+ done
44
+
45
+ # evaluate DetectGPT and its improvement DetectLLM
46
+ for D in $datasets; do
47
+ for M in $source_models; do
48
+ echo `date`, Evaluating DetectGPT on ${D}_${M} ...
49
+ python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
50
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
51
+ # we leverage DetectGPT to generate the perturbations
52
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
53
+ python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
54
+ --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
55
+ done
56
+ done
57
+
58
+
59
+ # Black-box Setting
60
+ echo `date`, Evaluate models in the black-box setting:
61
+ scoring_models="gpt-neo-2.7B"
62
+
63
+ # evaluate Fast-DetectGPT
64
+ for D in $datasets; do
65
+ for M in $source_models; do
66
+ M1=gpt-j-6B # sampling model
67
+ for M2 in $scoring_models; do
68
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
69
+ python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
70
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
71
+ done
72
+ done
73
+ done
74
+
75
+ # evaluate DetectGPT and its improvement DetectLLM
76
+ for D in $datasets; do
77
+ for M in $source_models; do
78
+ M1=t5-3b # perturbation model
79
+ for M2 in $scoring_models; do
80
+ echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
81
+ python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
82
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
83
+ # we leverage DetectGPT to generate the perturbations
84
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
85
+ python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
86
+ --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
87
+ done
88
+ done
89
+ done
metrics.py ADDED
@@ -0,0 +1,26 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+
6
+ import matplotlib.pyplot as plt
7
+ from sklearn.metrics import roc_curve, precision_recall_curve, auc
8
+
9
+ # 15 colorblind-friendly colors
10
+ COLORS = ["#0072B2", "#009E73", "#D55E00", "#CC79A7", "#F0E442",
11
+ "#56B4E9", "#E69F00", "#000000", "#0072B2", "#009E73",
12
+ "#D55E00", "#CC79A7", "#F0E442", "#56B4E9", "#E69F00"]
13
+
14
+
15
+ def get_roc_metrics(real_preds, sample_preds):
16
+ fpr, tpr, _ = roc_curve([0] * len(real_preds) + [1] * len(sample_preds), real_preds + sample_preds)
17
+ roc_auc = auc(fpr, tpr)
18
+ return fpr.tolist(), tpr.tolist(), float(roc_auc)
19
+
20
+
21
+ def get_precision_recall_metrics(real_preds, sample_preds):
22
+ precision, recall, _ = precision_recall_curve([0] * len(real_preds) + [1] * len(sample_preds),
23
+ real_preds + sample_preds)
24
+ pr_auc = auc(recall, precision)
25
+ return precision.tolist(), recall.tolist(), float(pr_auc)
26
+
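Both helpers label the original (human-written) scores as class 0 and the sampled (machine-generated) scores as class 1, so higher criterion values should indicate machine text. A minimal usage sketch (the scores below are made-up illustration values):

```python
from metrics import get_roc_metrics, get_precision_recall_metrics

real_scores = [-1.2, -0.8, -1.5, -0.4]   # criterion values for human-written passages
sampled_scores = [0.9, 1.4, 0.3, 1.1]    # criterion values for machine-generated passages

fpr, tpr, roc_auc = get_roc_metrics(real_scores, sampled_scores)
precision, recall, pr_auc = get_precision_recall_metrics(real_scores, sampled_scores)
print(f"ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
```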
model.py ADDED
@@ -0,0 +1,79 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+
6
+ from transformers import AutoModelForCausalLM, AutoTokenizer
7
+ import torch
8
+ import time
9
+ import os
10
+
11
+ def from_pretrained(cls, model_name, kwargs, cache_dir):
12
+ # use local model if it exists
13
+ local_path = os.path.join(cache_dir, 'local.' + model_name.replace("/", "_"))
14
+ if os.path.exists(local_path):
15
+ return cls.from_pretrained(local_path, **kwargs)
16
+ return cls.from_pretrained(model_name, **kwargs, cache_dir=cache_dir)
17
+
18
+ # predefined models
19
+ model_fullnames = { 'gpt2': 'gpt2',
20
+ 'gpt2-xl': 'gpt2-xl',
21
+ 'opt-2.7b': 'facebook/opt-2.7b',
22
+ 'gpt-neo-2.7B': 'EleutherAI/gpt-neo-2.7B',
23
+ 'gpt-j-6B': 'EleutherAI/gpt-j-6B',
24
+ 'gpt-neox-20b': 'EleutherAI/gpt-neox-20b',
25
+ 'mgpt': 'sberbank-ai/mGPT',
26
+ 'pubmedgpt': 'stanford-crfm/pubmedgpt',
27
+ 'mt5-xl': 'google/mt5-xl',
28
+ 'llama-13b': 'huggyllama/llama-13b',
29
+ 'llama2-13b': 'TheBloke/Llama-2-13B-fp16',
30
+ 'bloom-7b1': 'bigscience/bloom-7b1',
31
+ 'opt-13b': 'facebook/opt-13b',
32
+ }
33
+ float16_models = ['gpt-j-6B', 'gpt-neox-20b', 'llama-13b', 'llama2-13b', 'bloom-7b1', 'opt-13b']
34
+
35
+ def get_model_fullname(model_name):
36
+ return model_fullnames[model_name] if model_name in model_fullnames else model_name
37
+
38
+ def load_model(model_name, device, cache_dir):
39
+ model_fullname = get_model_fullname(model_name)
40
+ print(f'Loading model {model_fullname}...')
41
+ model_kwargs = {}
42
+ if model_name in float16_models:
43
+ model_kwargs.update(dict(torch_dtype=torch.float16))
44
+ if 'gpt-j' in model_name:
45
+ model_kwargs.update(dict(revision='float16'))
46
+ model = from_pretrained(AutoModelForCausalLM, model_fullname, model_kwargs, cache_dir)
47
+ print('Moving model to GPU...', end='', flush=True)
48
+ start = time.time()
49
+ model.to(device)
50
+ print(f'DONE ({time.time() - start:.2f}s)')
51
+ return model
52
+
53
+ def load_tokenizer(model_name, for_dataset, cache_dir):
54
+ model_fullname = get_model_fullname(model_name)
55
+ optional_tok_kwargs = {}
56
+ if "facebook/opt-" in model_fullname:
57
+ print("Using non-fast tokenizer for OPT")
58
+ optional_tok_kwargs['fast'] = False
59
+ if for_dataset in ['pubmed']:
60
+ optional_tok_kwargs['padding_side'] = 'left'
61
+ else:
62
+ optional_tok_kwargs['padding_side'] = 'right'
63
+ base_tokenizer = from_pretrained(AutoTokenizer, model_fullname, optional_tok_kwargs, cache_dir=cache_dir)
64
+ if base_tokenizer.pad_token_id is None:
65
+ base_tokenizer.pad_token_id = base_tokenizer.eos_token_id
66
+ if '13b' in model_fullname:
67
+ base_tokenizer.pad_token_id = 0
68
+ return base_tokenizer
69
+
70
+
71
+ if __name__ == '__main__':
72
+ import argparse
73
+ parser = argparse.ArgumentParser()
74
+ parser.add_argument('--model_name', type=str, default="bloom-7b1")
75
+ parser.add_argument('--cache_dir', type=str, default="../cache")
76
+ args = parser.parse_args()
77
+
78
+ load_tokenizer(args.model_name, 'xsum', args.cache_dir)
79
+ load_model(args.model_name, 'cpu', args.cache_dir)
paraphrasing.py ADDED
@@ -0,0 +1,106 @@
1
+ import random
2
+
3
+ import torch
4
+ from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
5
+ import numpy as np
6
+ import nltk
7
+ from data_builder import load_data, save_data
8
+ from model import from_pretrained
9
+
10
+ class T5Paraphraser:
11
+ def __init__(self, args):
12
+ self.device = args.device
13
+ self.tokenizer = from_pretrained(AutoTokenizer, args.t5_model_name, {}, args.cache_dir)
14
+ self.model = from_pretrained(AutoModelForSeq2SeqLM, args.t5_model_name, {}, args.cache_dir)
15
+ self.model = self.model.to(args.device)
16
+ self.model.eval()
17
+
18
+ def paraphrase(self, sents):
19
+ parabatch = ["paraphrase: " + sent + " </s>" for sent in sents]
20
+ encoding = self.tokenizer(parabatch, padding=True, return_tensors="pt")
21
+ input_ids, attention_masks = encoding["input_ids"].to(self.device), encoding["attention_mask"].to(self.device)
22
+ outputs = self.model.generate(
23
+ input_ids=input_ids, attention_mask=attention_masks,
24
+ max_length=256,
25
+ do_sample=True,
26
+ top_k=200,
27
+ top_p=0.95,
28
+ early_stopping=True,
29
+ num_return_sequences=1
30
+ )
31
+ assert len(sents) == len(outputs)
32
+ results = []
33
+ for output, sent in zip(outputs, sents):
34
+ line = self.tokenizer.decode(output, skip_special_tokens=True, clean_up_tokenization_spaces=True)
35
+ line = line.strip()
36
+ line = line if len(line) > 0 else sent
37
+ results.append(line)
38
+ return results
39
+
40
+ class RandomParaphraser:
41
+ def __init__(self, args):
42
+ self.device = args.device
43
+
44
+ def paraphrase(self, sents):
45
+ results = []
46
+ for sent in sents:
47
+ words = sent.split()
48
+ if len(words) > 20:
49
+ idx = random.randint(0, len(words) - 2)
50
+ words[idx], words[idx+1] = words[idx+1], words[idx]
51
+ results.append(' '.join(words))
52
+ return results
53
+
54
+ def generate_data(args):
55
+ data = load_data(args.dataset_file)
56
+ originals = data['original']
57
+ samples = data['sampled']
58
+ print(f"Total number of samples: {len(samples)}")
59
+ print(f"Average number of words: {np.mean([len(x.split()) for x in samples])}")
60
+
61
+ if args.paraphraser == 'random':
62
+ print(f'Using random paraphraser.')
63
+ paraphraser = RandomParaphraser(args)
64
+ else:
65
+ print(f'Loading model {args.t5_model_name}...')
66
+ paraphraser = T5Paraphraser(args)
67
+
68
+ new_samples = []
69
+ for sample in tqdm(samples):
70
+ lines = sample.split('\n')
71
+ new_lines = []
72
+ for line in lines:
73
+ line = line.strip()
74
+ if len(line) == 0:
75
+ new_lines.append(line)
76
+ else:
77
+ sents = nltk.sent_tokenize(line)
78
+ new_sents = paraphraser.paraphrase(sents)
79
+ new_lines.append(' '.join(new_sents))
80
+ new_samples.append('\n'.join(new_lines))
81
+
82
+ new_data = {'original': originals, 'sampled': new_samples}
83
+ save_data(args.output_file, args, new_data)
84
+
85
+
86
+ if __name__ == '__main__':
87
+ import argparse
88
+ from tqdm import tqdm
89
+ parser = argparse.ArgumentParser()
90
+ parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
91
+ parser.add_argument('--dataset', type=str, default="xsum")
92
+ parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
93
+ parser.add_argument('--t5_model_name', type=str, default="Vamsi/T5_Paraphrase_Paws")
94
+ parser.add_argument('--paraphraser', type=str, default="t5", choices=["t5", "random"])
95
+ parser.add_argument('--seed', type=int, default=0)
96
+ parser.add_argument('--device', type=str, default="cuda")
97
+ parser.add_argument('--cache_dir', type=str, default="../cache")
98
+ args = parser.parse_args()
99
+
100
+ torch.manual_seed(args.seed)
101
+ np.random.seed(args.seed)
102
+
103
+ import nltk
104
+ nltk.download('punkt')
105
+
106
+ generate_data(args)
report_results.py ADDED
@@ -0,0 +1,490 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+ import os.path
6
+ import argparse
7
+ import json
8
+ import numpy as np
9
+
10
+
11
+ def save_lines(lines, file):
12
+ with open(file, 'w') as fout:
13
+ fout.write('\n'.join(lines))
14
+
15
+ def get_auroc(result_file):
16
+ with open(result_file, 'r') as fin:
17
+ res = json.load(fin)
18
+ return res['metrics']['roc_auc']
19
+
20
+ def get_fpr_tpr(result_file):
21
+ with open(result_file, 'r') as fin:
22
+ res = json.load(fin)
23
+ return res['metrics']['fpr'], res['metrics']['tpr']
24
+
25
+ def report_main_results(args):
26
+ datasets = {'xsum': 'XSum',
27
+ 'squad': 'SQuAD',
28
+ 'writing': 'WritingPrompts'}
29
+ source_models = {'gpt2-xl': 'GPT-2',
30
+ 'opt-2.7b': 'OPT-2.7',
31
+ 'gpt-neo-2.7B': 'Neo-2.7',
32
+ 'gpt-j-6B': 'GPT-J',
33
+ 'gpt-neox-20b': 'NeoX'}
34
+ methods1 = {'likelihood': 'Likelihood',
35
+ 'entropy': 'Entropy',
36
+ 'logrank': 'LogRank',
37
+ 'lrr': 'LRR',
38
+ 'npr': 'NPR'}
39
+ methods2 = {'perturbation_100': 'DetectGPT',
40
+ 'sampling_discrepancy': 'Fast-DetectGPT'}
41
+
42
+ def _get_method_aurocs(dataset, method, filter=''):
43
+ cols = []
44
+ for model in source_models:
45
+ result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
46
+ if os.path.exists(result_file):
47
+ auroc = get_auroc(result_file)
48
+ else:
49
+ auroc = 0.0
50
+ cols.append(auroc)
51
+ cols.append(np.mean(cols))
52
+ return cols
53
+
54
+ headers = ['Method'] + [source_models[model] for model in source_models] + ['Avg.']
55
+ for dataset in datasets:
56
+ print('----')
57
+ print(datasets[dataset])
58
+ print('----')
59
+ print(' '.join(headers))
60
+ # basic methods
61
+ for method in methods1:
62
+ method_name = methods1[method]
63
+ cols = _get_method_aurocs(dataset, method)
64
+ cols = [f'{col:.4f}' for col in cols]
65
+ print(method_name, ' '.join(cols))
66
+ # white-box comparison
67
+ results = {}
68
+ for method in methods2:
69
+ method_name = methods2[method]
70
+ cols = _get_method_aurocs(dataset, method)
71
+ results[method_name] = cols
72
+ cols = [f'{col:.4f}' for col in cols]
73
+ print(method_name, ' '.join(cols))
74
+ cols = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
75
+ cols = [f'{col:.4f}' for col in cols]
76
+ print('(Diff)', ' '.join(cols))
77
+ # black-box comparison
78
+ filters = {'perturbation_100': '.t5-3b_gpt-neo-2.7B',
79
+ 'sampling_discrepancy': '.gpt-j-6B_gpt-neo-2.7B'}
80
+ results = {}
81
+ for method in methods2:
82
+ method_name = methods2[method]
83
+ cols = _get_method_aurocs(dataset, method, filters[method])
84
+ results[method_name] = cols
85
+ cols = [f'{col:.4f}' for col in cols]
86
+ print(method_name, ' '.join(cols))
87
+ cols = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
88
+ cols = [f'{col:.4f}' for col in cols]
89
+ print('(Diff)', ' '.join(cols))
90
+
91
+ def report_main_ext_results(args):
92
+ datasets = {'xsum': 'XSum',
93
+ 'squad': 'SQuAD',
94
+ 'writing': 'WritingPrompts'}
95
+ source_models = {'bloom-7b1': 'BLOOM-7.1',
96
+ 'opt-13b': 'OPT-13',
97
+ 'llama-13b': 'Llama-13',
98
+ 'llama2-13b': 'Llama2-13',
99
+ }
100
+ methods1 = {'likelihood': 'Likelihood',
101
+ 'entropy': 'Entropy',
102
+ 'logrank': 'LogRank',
103
+ 'lrr': 'LRR',
104
+ 'npr': 'NPR'}
105
+ methods2 = {'perturbation_100': 'DetectGPT',
106
+ 'sampling_discrepancy': 'Fast-DetectGPT'}
107
+
108
+ def _get_method_aurocs(dataset, method, filter=''):
109
+ cols = []
110
+ for model in source_models:
111
+ result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
112
+ if os.path.exists(result_file):
113
+ auroc = get_auroc(result_file)
114
+ else:
115
+ auroc = 0.0
116
+ cols.append(auroc)
117
+ cols.append(np.mean(cols))
118
+ return cols
119
+
120
+ headers = ['Method'] + [source_models[model] for model in source_models] + ['Avg.']
121
+ for dataset in datasets:
122
+ print('----')
123
+ print(datasets[dataset])
124
+ print('----')
125
+ print(' '.join(headers))
126
+ # basic methods
127
+ for method in methods1:
128
+ method_name = methods1[method]
129
+ cols = _get_method_aurocs(dataset, method)
130
+ cols = [f'{col:.4f}' for col in cols]
131
+ print(method_name, ' '.join(cols))
132
+ # white-box comparison
133
+ results = {}
134
+ for method in methods2:
135
+ method_name = methods2[method]
136
+ cols = _get_method_aurocs(dataset, method)
137
+ results[method_name] = cols
138
+ cols = [f'{col:.4f}' for col in cols]
139
+ print(method_name, ' '.join(cols))
140
+ cols = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
141
+ cols = [f'{col:.4f}' for col in cols]
142
+ print('(Diff)', ' '.join(cols))
143
+ # black-box comparison
144
+ filters = {'perturbation_100': '.t5-3b_gpt-neo-2.7B',
145
+ 'sampling_discrepancy': '.gpt-j-6B_gpt-neo-2.7B'}
146
+ results = {}
147
+ for method in methods2:
148
+ method_name = methods2[method]
149
+ cols = _get_method_aurocs(dataset, method, filters[method])
150
+ results[method_name] = cols
151
+ cols = [f'{col:.4f}' for col in cols]
152
+ print(method_name, ' '.join(cols))
153
+ cols = np.array(results['Fast-DetectGPT']) - np.array(results['DetectGPT'])
154
+ cols = [f'{col:.4f}' for col in cols]
155
+ print('(Diff)', ' '.join(cols))
156
+
157
+ def report_refmodel_results(args):
158
+ datasets = {'xsum': 'XSum',
159
+ 'squad': 'SQuAD',
160
+ 'writing': 'WritingPrompts'}
161
+ source_models = {'gpt2-xl': 'GPT-2',
162
+ 'gpt-neo-2.7B': 'Neo-2.7',
163
+ 'gpt-j-6B': 'GPT-J'}
164
+
165
+ def _get_method_aurocs(method, ref_model=None):
166
+ cols = []
167
+ for dataset in datasets:
168
+ for model in source_models:
169
+ filter = '' if ref_model is None or ref_model == model else f'.{ref_model}_{model}'
170
+ result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
171
+ if os.path.exists(result_file):
172
+ auroc = get_auroc(result_file)
173
+ else:
174
+ auroc = 0.0
175
+ cols.append(auroc)
176
+ cols.append(np.mean(cols))
177
+ return cols
178
+
179
+ headers1 = ['----'] + list([datasets[d] for d in datasets])
180
+ headers2 = ['Method'] + [source_models[model] for model in source_models] \
181
+ + [source_models[model] for model in source_models] \
182
+ + [source_models[model] for model in source_models] \
183
+ + ['Avg.']
184
+ print(' '.join(headers1))
185
+ print(' '.join(headers2))
186
+
187
+ ref_models = [None, 'gpt2-xl', 'gpt-neo-2.7B', 'gpt-j-6B']
188
+ for ref_model in ref_models:
189
+ method = 'sampling_discrepancy'
190
+ method_name = 'Fast-DetectGPT (*/*)' if ref_model is None else f'Fast-DetectGPT ({source_models[ref_model]}/*)'
191
+ cols = _get_method_aurocs(method, ref_model)
192
+ cols = [f'{col:.4f}' for col in cols]
193
+ print(method_name, ' '.join(cols))
194
+
195
+
196
+ def report_chatgpt_gpt4_results(args):
197
+ datasets = {'xsum': 'XSum',
198
+ 'writing': 'Writing',
199
+ 'pubmed': 'PubMed'}
200
+ source_models = {'gpt-3.5-turbo': 'ChatGPT',
201
+ 'gpt-4': 'GPT-4'}
202
+ score_models = { 't5-11b': 'T5-11B',
203
+ 'gpt2-xl': 'GPT-2',
204
+ 'opt-2.7b': 'OPT-2.7',
205
+ 'gpt-neo-2.7B': 'Neo-2.7',
206
+ 'gpt-j-6B': 'GPT-J',
207
+ 'gpt-neox-20b': 'NeoX'}
208
+ methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
209
+ 'roberta-large-openai-detector': 'RoBERTa-large'}
210
+ methods2 = {'likelihood': 'Likelihood', 'entropy': 'Entropy', 'logrank': 'LogRank'}
211
+ methods3 = {'lrr': 'LRR', 'npr': 'NPR', 'perturbation_100': 'DetectGPT',
212
+ 'sampling_discrepancy_analytic': 'Fast'}
213
+
214
+ def _get_method_aurocs(method, filter=''):
215
+ results = []
216
+ for model in source_models:
217
+ cols = []
218
+ for dataset in datasets:
219
+ result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
220
+ if os.path.exists(result_file):
221
+ auroc = get_auroc(result_file)
222
+ else:
223
+ auroc = 0.0
224
+ cols.append(auroc)
225
+ cols.append(np.mean(cols))
226
+ results.extend(cols)
227
+ return results
228
+
229
+ headers1 = ['--'] + [source_models[model] for model in source_models]
230
+ headers2 = ['Method'] + [datasets[dataset] for dataset in datasets] + ['Avg.'] \
231
+ + [datasets[dataset] for dataset in datasets] + ['Avg.']
232
+ print(' '.join(headers1))
233
+ print(' '.join(headers2))
234
+ # supervised methods
235
+ for method in methods1:
236
+ method_name = methods1[method]
237
+ cols = _get_method_aurocs(method)
238
+ cols = [f'{col:.4f}' for col in cols]
239
+ print(method_name, ' '.join(cols))
240
+ # zero-shot methods
241
+
242
+ filters2 = {'likelihood': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
243
+ 'entropy': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
244
+ 'logrank': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b']}
245
+ filters3 = {'lrr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
246
+ 'npr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
247
+ 'perturbation_100': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
248
+ 'sampling_discrepancy_analytic': ['.gpt-j-6B_gpt2-xl', '.gpt-j-6B_gpt-neo-2.7B', '.gpt-j-6B_gpt-j-6B', '.gpt-neox-20b_gpt-neox-20b']}
249
+ for method in methods2:
250
+ for filter in filters2[method]:
251
+ setting = score_models[filter[1:]]
252
+ method_name = f'{methods2[method]}({setting})'
253
+ cols = _get_method_aurocs(method, filter)
254
+ cols = [f'{col:.4f}' for col in cols]
255
+ print(method_name, ' '.join(cols))
256
+ for method in methods3:
257
+ for filter in filters3[method]:
258
+ setting = [score_models[model] for model in filter[1:].split('_')]
259
+ method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
260
+ cols = _get_method_aurocs(method, filter)
261
+ cols = [f'{col:.4f}' for col in cols]
262
+ print(method_name, ' '.join(cols))
263
+
264
+ def report_gpt3_results(args):
265
+ datasets = {'xsum': 'XSum',
266
+ 'writing': 'Writing',
267
+ 'pubmed': 'PubMed'}
268
+ source_models = {'davinci': 'GPT-3'}
269
+ score_models = { 't5-11b': 'T5-11B',
270
+ 'gpt2-xl': 'GPT-2',
271
+ 'opt-2.7b': 'OPT-2.7',
272
+ 'gpt-neo-2.7B': 'Neo-2.7',
273
+ 'gpt-j-6B': 'GPT-J',
274
+ 'gpt-neox-20b': 'NeoX'}
275
+ methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
276
+ 'roberta-large-openai-detector': 'RoBERTa-large'}
277
+ methods2 = {'likelihood': 'Likelihood', 'entropy': 'Entropy', 'logrank': 'LogRank'}
278
+ methods3 = {'lrr': 'LRR', 'npr': 'NPR', 'perturbation_100': 'DetectGPT',
279
+ 'sampling_discrepancy_analytic': 'Fast'}
280
+
281
+ def _get_method_aurocs(method, filter=''):
282
+ results = []
283
+ for model in source_models:
284
+ cols = []
285
+ for dataset in datasets:
286
+ result_file = f'{args.result_path}/{dataset}_{model}{filter}.{method}.json'
287
+ if os.path.exists(result_file):
288
+ auroc = get_auroc(result_file)
289
+ else:
290
+ auroc = 0.0
291
+ cols.append(auroc)
292
+ cols.append(np.mean(cols))
293
+ results.extend(cols)
294
+ return results
295
+
296
+ headers1 = ['--'] + [source_models[model] for model in source_models]
297
+ headers2 = ['Method'] + [datasets[dataset] for dataset in datasets] + ['Avg.'] \
298
+ + [datasets[dataset] for dataset in datasets] + ['Avg.']
299
+ print(' '.join(headers1))
300
+ print(' '.join(headers2))
301
+ # supervised methods
302
+ for method in methods1:
303
+ method_name = methods1[method]
304
+ cols = _get_method_aurocs(method)
305
+ cols = [f'{col:.4f}' for col in cols]
306
+ print(method_name, ' '.join(cols))
307
+ # zero-shot methods
308
+
309
+ filters2 = {'likelihood': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
310
+ 'entropy': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b'],
311
+ 'logrank': ['.gpt2-xl', '.gpt-neo-2.7B', '.gpt-j-6B', '.gpt-neox-20b']}
312
+ filters3 = {'lrr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
313
+ 'npr': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
314
+ 'perturbation_100': ['.t5-11b_gpt2-xl', '.t5-11b_gpt-neo-2.7B', '.t5-11b_gpt-j-6B', '.t5-11b_gpt-neox-20b'],
315
+ 'sampling_discrepancy_analytic': ['.gpt-j-6B_gpt2-xl', '.gpt-j-6B_gpt-neo-2.7B', '.gpt-j-6B_gpt-j-6B', '.gpt-neox-20b_gpt-neox-20b']}
316
+ for method in methods2:
317
+ for filter in filters2[method]:
318
+ setting = score_models[filter[1:]]
319
+ method_name = f'{methods2[method]}({setting})'
320
+ cols = _get_method_aurocs(method, filter)
321
+ cols = [f'{col:.4f}' for col in cols]
322
+ print(method_name, ' '.join(cols))
323
+ for method in methods3:
324
+ for filter in filters3[method]:
325
+ setting = [score_models[model] for model in filter[1:].split('_')]
326
+ method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
327
+ cols = _get_method_aurocs(method, filter)
328
+ cols = [f'{col:.4f}' for col in cols]
329
+ print(method_name, ' '.join(cols))
330
+
331
+ def report_maxlen_trends(args):
332
+ datasets = {'xsum': 'XSum',
333
+ 'writing': 'WritingPrompts'}
334
+ source_models = {'gpt-3.5-turbo': 'ChatGPT',
335
+ 'gpt-4': 'GPT-4'}
336
+ score_models = {'t5-11b': 'T5-11B',
337
+ 'gpt2-xl': 'GPT-2',
338
+ 'opt-2.7b': 'OPT-2.7',
339
+ 'gpt-neo-2.7B': 'Neo-2.7',
340
+ 'gpt-j-6B': 'GPT-J',
341
+ 'gpt-neox-20b': 'NeoX'}
342
+ methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
343
+ 'roberta-large-openai-detector': 'RoBERTa-large'}
344
+ methods2 = {'likelihood': 'Likelihood'}
345
+ methods3 = {'perturbation_100': 'DetectGPT',
346
+ 'sampling_discrepancy_analytic': 'Fast-Detect'}
347
+ maxlens = [30, 60, 90, 120, 150, 180]
348
+
349
+ def _get_method_aurocs(root_path, dataset, source_model, method, filter=''):
350
+ cols = []
351
+ for maxlen in maxlens:
352
+ result_file = f'{root_path}/exp_maxlen{maxlen}/results/{dataset}_{source_model}{filter}.{method}.json'
353
+ if os.path.exists(result_file):
354
+ auroc = get_auroc(result_file)
355
+ else:
356
+ auroc = 0.0
357
+ cols.append(auroc)
358
+ return cols
359
+
360
+ filters2 = {'likelihood': '.gpt-neo-2.7B'}
361
+ filters3 = {'perturbation_100': '.t5-11b_gpt-neo-2.7B',
362
+ 'sampling_discrepancy_analytic': '.gpt-j-6B_gpt-neo-2.7B'}
363
+
364
+ headers = ['Method'] + [str(maxlen) for maxlen in maxlens]
365
+ print(' '.join(headers))
366
+ # print table per model and dataset
367
+ results = {}
368
+ for model in source_models:
369
+ model_name = source_models[model]
370
+ for data in datasets:
371
+ data_name = datasets[data]
372
+ print('----')
373
+ print(f'{model_name} / {data_name}')
374
+ print('----')
375
+ for method in methods1:
376
+ method_name = methods1[method]
377
+ cols = _get_method_aurocs('.', data, model, method)
378
+ results[f'{model_name}_{data_name}_{method_name}'] = cols
379
+ cols = [f'{col:.4f}' for col in cols]
380
+ print(method_name, ' '.join(cols))
381
+ for method in methods2:
382
+ filter = filters2[method]
383
+ setting = score_models[filter[1:]]
384
+ method_name = f'{methods2[method]}({setting})'
385
+ cols = _get_method_aurocs('.', data, model, method, filter)
386
+ results[f'{model_name}_{data_name}_{method_name}'] = cols
387
+ cols = [f'{col:.4f}' for col in cols]
388
+ print(method_name, ' '.join(cols))
389
+ for method in methods3:
390
+ filter = filters3[method]
391
+ setting = [score_models[model] for model in filter[1:].split('_')]
392
+ method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
393
+ cols = _get_method_aurocs('.', data, model, method, filter)
394
+ results[f'{model_name}_{data_name}_{method_name}'] = cols
395
+ cols = [f'{col:.4f}' for col in cols]
396
+ print(method_name, ' '.join(cols))
397
+ import json
398
+ json_file = './exp_analysis/maxlen_trends.json'
399
+ with open(json_file, 'w') as fout:
400
+ json.dump(results, fout)
401
+ print(f'Write to file {json_file}')
402
+
403
+ def report_auroc_curve(args):
404
+ datasets = {'xsum': 'XSum',
405
+ 'writing': 'WritingPrompts'}
406
+ source_models = {'gpt-3.5-turbo': 'ChatGPT',
407
+ 'gpt-4': 'GPT-4'}
408
+ score_models = {'t5-11b': 'T5-11B',
409
+ 'gpt2-xl': 'GPT-2',
410
+ 'opt-2.7b': 'OPT-2.7',
411
+ 'gpt-neo-2.7B': 'Neo-2.7',
412
+ 'gpt-j-6B': 'GPT-J',
413
+ 'gpt-neox-20b': 'NeoX'}
414
+ methods1 = {'roberta-base-openai-detector': 'RoBERTa-base',
415
+ 'roberta-large-openai-detector': 'RoBERTa-large'}
416
+ methods2 = {'likelihood': 'Likelihood'}
417
+ methods3 = {'perturbation_100': 'DetectGPT',
418
+ 'sampling_discrepancy_analytic': 'Fast-Detect'}
419
+
420
+ def _get_method_fpr_tpr(root_path, dataset, source_model, method, filter=''):
421
+ maxlen = 180
422
+ result_file = f'{root_path}/exp_maxlen{maxlen}/results/{dataset}_{source_model}{filter}.{method}.json'
423
+ if os.path.exists(result_file):
424
+ fpr, tpr = get_fpr_tpr(result_file)
425
+ else:
426
+ fpr, tpr = [], []
427
+ assert len(fpr) == len(tpr)
428
+ return list(zip(fpr, tpr))
429
+
430
+ filters2 = {'likelihood': '.gpt-neo-2.7B'}
431
+ filters3 = {'perturbation_100': '.t5-11b_gpt-neo-2.7B',
432
+ 'sampling_discrepancy_analytic': '.gpt-j-6B_gpt-neo-2.7B'}
433
+
434
+ # print table per model and dataset
435
+ results = {}
436
+ for model in source_models:
437
+ model_name = source_models[model]
438
+ for data in datasets:
439
+ data_name = datasets[data]
440
+ print('----')
441
+ print(f'{model_name} / {data_name}')
442
+ print('----')
443
+ for method in methods1:
444
+ method_name = methods1[method]
445
+ cols = _get_method_fpr_tpr('.', data, model, method)
446
+ results[f'{model_name}_{data_name}_{method_name}'] = cols
447
+ cols = [f'({col[0]:.3f},{col[1]:.3f})' for col in cols]
448
+ print(method_name, ' '.join(cols))
449
+ for method in methods2:
450
+ filter = filters2[method]
451
+ setting = score_models[filter[1:]]
452
+ method_name = f'{methods2[method]}({setting})'
453
+ cols = _get_method_fpr_tpr('.', data, model, method, filter)
454
+ results[f'{model_name}_{data_name}_{method_name}'] = cols
455
+ cols = [f'({col[0]:.3f},{col[1]:.3f})' for col in cols]
456
+ print(method_name, ' '.join(cols))
457
+ for method in methods3:
458
+ filter = filters3[method]
459
+ setting = [score_models[model] for model in filter[1:].split('_')]
460
+ method_name = f'{methods3[method]}({setting[0]}/{setting[1]})'
461
+ cols = _get_method_fpr_tpr('.', data, model, method, filter)
462
+ results[f'{model_name}_{data_name}_{method_name}'] = cols
463
+ cols = [f'({col[0]:.3f},{col[1]:.3f})' for col in cols]
464
+ print(method_name, ' '.join(cols))
465
+ import json
466
+ json_file = './exp_analysis/auroc_curve.json'
467
+ with open(json_file, 'w') as fout:
468
+ json.dump(results, fout)
469
+ print(f'Write to file {json_file}')
470
+
471
+ if __name__ == '__main__':
472
+ parser = argparse.ArgumentParser()
473
+ parser.add_argument('--result_path', type=str, default="./exp_main/results/")
474
+ parser.add_argument('--report_name', type=str, default="main_results")
475
+ args = parser.parse_args()
476
+
477
+ if args.report_name == 'main_results':
478
+ report_main_results(args)
479
+ elif args.report_name == 'main_ext_results':
480
+ report_main_ext_results(args)
481
+ elif args.report_name == 'chatgpt_gpt4_results':
482
+ report_chatgpt_gpt4_results(args)
483
+ elif args.report_name == 'gpt3_results':
484
+ report_gpt3_results(args)
485
+ elif args.report_name == 'maxlen_trends':
486
+ report_maxlen_trends(args)
487
+ elif args.report_name == 'auroc_curve':
488
+ report_auroc_curve(args)
489
+ elif args.report_name == 'refmodel_results':
490
+ report_refmodel_results(args)
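Note on the reporting code above: each `report_*` function only assembles result-file names of the form `{dataset}_{model}{filter}.{method}.json` and delegates the metric lookup to `get_auroc` / `get_fpr_tpr`, which are defined in the earlier (unshown) part of report_results.py. The `filter` strings (e.g. `.gpt-j-6B_gpt-neo-2.7B`) select the black-box variants, where the first name is the sampling/perturbation model and the second the scoring model, matching the file naming used by the shell scripts below. A minimal sketch of what those helpers plausibly do, based on the `metrics` block that the evaluation scripts store in each result file (see `supervised.py` and `show_result.py` below); treat it as an illustration, not the repo's exact code:

```python
# Illustrative sketch only: the real get_auroc / get_fpr_tpr live in the unshown
# head of report_results.py. They read the 'metrics' entry written by the
# evaluation scripts (baselines.py, fast_detect_gpt.py, supervised.py, ...).
import json

def get_auroc(result_file):
    with open(result_file, 'r') as fin:
        res = json.load(fin)
    return res['metrics']['roc_auc']

def get_fpr_tpr(result_file):
    with open(result_file, 'r') as fin:
        res = json.load(fin)
    return res['metrics']['fpr'], res['metrics']['tpr']
```

To regenerate a table, run the script with the matching report name, e.g. `python scripts/report_results.py --result_path ./exp_main/results/ --report_name chatgpt_gpt4_results`; the available names are the ones handled in the `__main__` block above.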
requirements.txt CHANGED
@@ -1,3 +1,8 @@
1
- streamlit
2
- transformers
3
- torch==2.0.0
1
+ torch
2
+ numpy
3
+ transformers==4.28.1
4
+ datasets==2.12.0
5
+ matplotlib
6
+ tqdm
7
+ openai
8
+ nltk
setup.sh ADDED
@@ -0,0 +1 @@
1
+ pip install -r requirements.txt
show_result.py ADDED
@@ -0,0 +1,51 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+
6
+ import matplotlib
7
+ import matplotlib.pyplot as plt
8
+ import argparse
9
+ import glob
10
+ import json
11
+ from os import path
12
+
13
+ import numpy as np
14
+
15
+ matplotlib.use('Agg')
16
+
17
+ # plot overlaid histograms of model-generated (sampled) vs. human-written (original) scores
18
+ def save_histogram(predictions, figure_file):
19
+ plt.figure(figsize=(4, 2.5))
20
+ plt.subplot(1, 1, 1)
21
+ plt.hist(predictions["samples"], alpha=0.5, bins='auto', label='Model')
22
+ plt.hist(predictions["real"], alpha=0.5, bins='auto', label='Human')
23
+ plt.xlabel("Sampling Discrepancy")
24
+ plt.ylabel('Frequency')
25
+ plt.legend(loc='upper right')
26
+ plt.tight_layout()
27
+ plt.savefig(figure_file)
28
+
29
+ if __name__ == '__main__':
30
+ parser = argparse.ArgumentParser()
31
+ parser.add_argument('--result_files', type=str, default="./exp_test/results/*.json")
32
+ parser.add_argument('--draw', action='store_true')
33
+ args = parser.parse_args()
34
+
35
+ for res_file in glob.glob(args.result_files, recursive=True):
36
+ with open(res_file, 'r') as fin:
37
+ res = json.load(fin)
38
+ if 'metrics' in res:
39
+ n_samples = res['info']['n_samples']
40
+ roc_auc = res['metrics']['roc_auc']
41
+ real = res['predictions']['real']
42
+ samples = res['predictions']['samples']
43
+ print(f"{res_file}: roc_auc={roc_auc:.4f} n_samples={n_samples} r:{np.mean(real):.2f}/{np.std(real):.2f} s:{np.mean(samples):.2f}/{np.std(samples):.2f}")
44
+ else:
45
+ print(f"{res_file}: metrics not found.")
46
+ # draw histogram
47
+ if args.draw:
48
+ fig_file = f"{res_file}.pdf"
49
+ save_histogram(res['predictions'], fig_file)
50
+ print(f"{fig_file}: histogram figure saved.")
51
+
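show_result.py only needs the `info`, `predictions` and `metrics` entries of a result file, so its behaviour is easy to check with a hand-made placeholder before any real experiment has finished. A small, self-contained example of that schema (all numbers are made up), mirroring the dictionary written by `supervised.py` below:

```python
# Write a toy result file in the schema show_result.py expects (values are placeholders).
import json, os

result = {
    'info': {'n_samples': 2},
    'predictions': {'real': [-1.2, -0.8],    # criterion scores for human-written passages
                    'samples': [1.5, 2.1]},  # criterion scores for model-generated passages
    'metrics': {'roc_auc': 1.0, 'fpr': [0.0, 0.0, 1.0], 'tpr': [0.0, 1.0, 1.0]},
}
os.makedirs('./exp_test/results', exist_ok=True)
with open('./exp_test/results/demo.sampling_discrepancy.json', 'w') as fout:
    json.dump(result, fout)
```

Running `python scripts/show_result.py --result_files "./exp_test/results/*.json" --draw` then prints the AUROC line for this file and saves a `demo.sampling_discrepancy.json.pdf` histogram next to it.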
supervised.py ADDED
@@ -0,0 +1,78 @@
1
+ # Copyright (c) Guangsheng Bao.
2
+ #
3
+ # This source code is licensed under the MIT license found in the
4
+ # LICENSE file in the root directory of this source tree.
5
+
6
+ import numpy as np
7
+ import torch
8
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
9
+ import tqdm
10
+ import argparse
11
+ import json
12
+ from data_builder import load_data
13
+ from metrics import get_roc_metrics, get_precision_recall_metrics
14
+ from model import from_pretrained
15
+
16
+ def experiment(args):
17
+ # load model
18
+ print(f'Beginning supervised evaluation with {args.model_name}...')
19
+ detector = from_pretrained(AutoModelForSequenceClassification, args.model_name, {}, args.cache_dir).to(args.device)
20
+ tokenizer = from_pretrained(AutoTokenizer, args.model_name, {}, args.cache_dir)
21
+ detector.eval()
22
+ # load data
23
+ data = load_data(args.dataset_file)
24
+ n_samples = len(data["sampled"])
25
+ # eval detector
26
+ name = args.model_name
27
+ torch.manual_seed(args.seed)
28
+ np.random.seed(args.seed)
29
+ eval_results = []
30
+ for idx in tqdm.tqdm(range(n_samples), desc=f"Computing {name} criterion"):
31
+ original_text = data["original"][idx]
32
+ sampled_text = data["sampled"][idx]
33
+ # original text
34
+ tokenized = tokenizer(original_text, padding=True, truncation=True, max_length=512, return_tensors="pt").to(args.device)
35
+ with torch.no_grad():
36
+ original_crit = detector(**tokenized).logits.softmax(-1)[0, 0].item()
37
+ # sampled text
38
+ tokenized = tokenizer(sampled_text, padding=True, truncation=True, max_length=512, return_tensors="pt").to(args.device)
39
+ with torch.no_grad():
40
+ sampled_crit = detector(**tokenized).logits.softmax(-1)[0, 0].item()
41
+ # result
42
+ eval_results.append({"original": original_text,
43
+ "original_crit": original_crit,
44
+ "sampled": sampled_text,
45
+ "sampled_crit": sampled_crit})
46
+
47
+ # compute prediction scores for real/sampled passages
48
+ predictions = {'real': [x["original_crit"] for x in eval_results],
49
+ 'samples': [x["sampled_crit"] for x in eval_results]}
50
+ fpr, tpr, roc_auc = get_roc_metrics(predictions['real'], predictions['samples'])
51
+ p, r, pr_auc = get_precision_recall_metrics(predictions['real'], predictions['samples'])
52
+ print(f"Criterion {name}_threshold ROC AUC: {roc_auc:.4f}, PR AUC: {pr_auc:.4f}")
53
+ # log results
54
+ results_file = f'{args.output_file}.{name}.json'
55
+ results = { 'name': f'{name}_threshold',
56
+ 'info': {'n_samples': n_samples},
57
+ 'predictions': predictions,
58
+ 'raw_results': eval_results,
59
+ 'metrics': {'roc_auc': roc_auc, 'fpr': fpr, 'tpr': tpr},
60
+ 'pr_metrics': {'pr_auc': pr_auc, 'precision': p, 'recall': r},
61
+ 'loss': 1 - pr_auc}
62
+ with open(results_file, 'w') as fout:
63
+ json.dump(results, fout)
64
+ print(f'Results written into {results_file}')
65
+
66
+
67
+ if __name__ == '__main__':
68
+ parser = argparse.ArgumentParser()
69
+ parser.add_argument('--output_file', type=str, default="./exp_test/results/xsum_gpt2")
70
+ parser.add_argument('--dataset', type=str, default="xsum")
71
+ parser.add_argument('--dataset_file', type=str, default="./exp_test/data/xsum_gpt2")
72
+ parser.add_argument('--model_name', type=str, default="roberta-base-openai-detector")
73
+ parser.add_argument('--seed', type=int, default=0)
74
+ parser.add_argument('--device', type=str, default="cuda")
75
+ parser.add_argument('--cache_dir', type=str, default="../cache")
76
+ args = parser.parse_args()
77
+
78
+ experiment(args)
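For a single passage, the scoring logic in `experiment()` reduces to taking the softmax probability of class index 0 from the sequence classifier (the machine-generated/"fake" class for the `roberta-*-openai-detector` checkpoints). A hypothetical standalone helper along those lines, loading the model directly from Hugging Face rather than through the repo's `model.from_pretrained` wrapper:

```python
# Hypothetical one-passage scorer distilled from experiment() above.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

def machine_score(text, model_name="roberta-base-openai-detector", device="cpu"):
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    detector = AutoModelForSequenceClassification.from_pretrained(model_name).to(device)
    detector.eval()
    tokenized = tokenizer(text, truncation=True, max_length=512, return_tensors="pt").to(device)
    with torch.no_grad():
        # probability of class 0, i.e. the "fake"/machine-generated label for these checkpoints
        return detector(**tokenized).logits.softmax(-1)[0, 0].item()

# Example: machine_score("The quick brown fox jumps over the lazy dog.")
```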
supervised.sh ADDED
@@ -0,0 +1,56 @@
1
+ #!/usr/bin/env bash
2
+ # Copyright (c) Guangsheng Bao.
3
+ #
4
+ # This source code is licensed under the MIT license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # setup the environment
8
+ echo `date`, Setup the environment ...
9
+ set -e # exit if error
10
+
11
+ # prepare folders
12
+ exp_path=exp_supervised
13
+ data_path=$exp_path/data
14
+ res_path=$exp_path/results
15
+ mkdir -p $exp_path $data_path $res_path
16
+
17
+ # preparing dataset
18
+ for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
19
+ IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
20
+ echo `date`, Preparing dataset ${D}-${M} ...
21
+ python scripts/data_builder.py --dataset $D --n_samples 200 --base_model_name $M --output_file $data_path/${D}_${M}
22
+ done
23
+
24
+ # evaluate baselines
25
+ for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
26
+ IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
27
+ echo `date`, Evaluating baseline methods on ${D}_${M} ...
28
+ python scripts/baselines.py --scoring_model_name $M --dataset $D \
29
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
30
+ done
31
+
32
+ # evaluate supervised detectors
33
+ for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
34
+ IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
35
+ for SM in roberta-base-openai-detector roberta-large-openai-detector; do
36
+ echo `date`, Evaluating ${SM} on ${D}_${M} ...
37
+ python scripts/supervised.py --model_name $SM --dataset $D \
38
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
39
+ done
40
+ done
41
+
42
+ # evaluate DetectGPT
43
+ for P in "english:mgpt:mt5-xl" "german:mgpt:mt5-xl" "pubmed:pubmedgpt:t5-11b" "xsum:gpt2-xl:t5-11b"; do
44
+ IFS=':' read -r -a P <<< $P && D=${P[0]} && M1=${P[1]} && M2=${P[2]}
45
+ echo `date`, Evaluating DetectGPT on ${D}_${M1}_${M2} ...
46
+ python scripts/detect_gpt.py --scoring_model_name $M1 --mask_filling_model_name $M2 --n_perturbations 100 --dataset $D \
47
+ --dataset_file $data_path/${D}_${M1} --output_file $res_path/${D}_${M1}_${M2}
48
+ done
49
+
50
+ # evaluate Fast-DetectGPT
51
+ for P in "english:mgpt" "german:mgpt" "pubmed:pubmedgpt" "xsum:gpt2-xl"; do
52
+ IFS=':' read -r -a P <<< $P && D=${P[0]} && M=${P[1]}
53
+ echo `date`, Evaluating Fast-DetectGPT on ${D}-${M} ...
54
+ python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M \
55
+ --dataset $D --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
56
+ done
temperature.sh ADDED
@@ -0,0 +1,88 @@
1
+ #!/usr/bin/env bash
2
+ # Copyright (c) Guangsheng Bao.
3
+ #
4
+ # This source code is licensed under the MIT license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # setup the environment
8
+ echo `date`, Setup the environment ...
9
+ set -e # exit if error
10
+
11
+ # prepare folders
12
+ exp_path=exp_temperature
13
+ data_path=$exp_path/data
14
+ res_path=$exp_path/results
15
+ mkdir -p $exp_path $data_path $res_path
16
+
17
+ datasets="xsum squad writing"
18
+ source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"
19
+
20
+ # preparing dataset
21
+ for D in $datasets; do
22
+ for M in $source_models; do
23
+ echo `date`, Preparing dataset ${D}-${M} ...
24
+ python scripts/data_builder.py --dataset $D --n_samples 500 --do_temperature --base_model_name $M --output_file $data_path/${D}_${M}
25
+ done
26
+ done
27
+
28
+ # White-box Setting
29
+ echo `date`, Evaluate models in the white-box setting:
30
+
31
+ # evaluate Fast-DetectGPT and fast baselines
32
+ for D in $datasets; do
33
+ for M in $source_models; do
34
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
35
+ python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
36
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
37
+
38
+ echo `date`, Evaluating baseline methods on ${D}_${M} ...
39
+ python scripts/baselines.py --scoring_model_name $M --dataset $D \
40
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
41
+ done
42
+ done
43
+
44
+ # evaluate DetectGPT and its improvement DetectLLM
45
+ for D in $datasets; do
46
+ for M in $source_models; do
47
+ echo `date`, Evaluating DetectGPT on ${D}_${M} ...
48
+ python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
49
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
50
+ # we leverage DetectGPT to generate the perturbations
51
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
52
+ python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
53
+ --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
54
+ done
55
+ done
56
+
57
+
58
+ # Black-box Setting
59
+ echo `date`, Evaluate models in the black-box setting:
60
+ scoring_models="gpt-neo-2.7B"
61
+
62
+ # evaluate Fast-DetectGPT
63
+ for D in $datasets; do
64
+ for M in $source_models; do
65
+ M1=gpt-j-6B # sampling model
66
+ for M2 in $scoring_models; do
67
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
68
+ python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
69
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
70
+ done
71
+ done
72
+ done
73
+
74
+ # evaluate DetectGPT and its improvement DetectLLM
75
+ for D in $datasets; do
76
+ for M in $source_models; do
77
+ M1=t5-3b # perturbation model
78
+ for M2 in $scoring_models; do
79
+ echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
80
+ python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
81
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
82
+ # we leverage DetectGPT to generate the perturbations
83
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
84
+ python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
85
+ --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
86
+ done
87
+ done
88
+ done
topk.sh ADDED
@@ -0,0 +1,88 @@
1
+ #!/usr/bin/env bash
2
+ # Copyright (c) Guangsheng Bao.
3
+ #
4
+ # This source code is licensed under the MIT license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # setup the environment
8
+ echo `date`, Setup the environment ...
9
+ set -e # exit if error
10
+
11
+ # prepare folders
12
+ exp_path=exp_topk
13
+ data_path=$exp_path/data
14
+ res_path=$exp_path/results
15
+ mkdir -p $exp_path $data_path $res_path
16
+
17
+ datasets="xsum squad writing"
18
+ source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"
19
+
20
+ # preparing dataset
21
+ for D in $datasets; do
22
+ for M in $source_models; do
23
+ echo `date`, Preparing dataset ${D}-${M} ...
24
+ python scripts/data_builder.py --dataset $D --n_samples 500 --do_top_k --base_model_name $M --output_file $data_path/${D}_${M}
25
+ done
26
+ done
27
+
28
+ # White-box Setting
29
+ echo `date`, Evaluate models in the white-box setting:
30
+
31
+ # evaluate Fast-DetectGPT and fast baselines
32
+ for D in $datasets; do
33
+ for M in $source_models; do
34
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
35
+ python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
36
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
37
+
38
+ echo `date`, Evaluating baseline methods on ${D}_${M} ...
39
+ python scripts/baselines.py --scoring_model_name $M --dataset $D \
40
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
41
+ done
42
+ done
43
+
44
+ # evaluate DetectGPT and its improvement DetectLLM
45
+ for D in $datasets; do
46
+ for M in $source_models; do
47
+ echo `date`, Evaluating DetectGPT on ${D}_${M} ...
48
+ python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
49
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
50
+ # we leverage DetectGPT to generate the perturbations
51
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
52
+ python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
53
+ --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
54
+ done
55
+ done
56
+
57
+
58
+ # Black-box Setting
59
+ echo `date`, Evaluate models in the black-box setting:
60
+ scoring_models="gpt-neo-2.7B"
61
+
62
+ # evaluate Fast-DetectGPT
63
+ for D in $datasets; do
64
+ for M in $source_models; do
65
+ M1=gpt-j-6B # sampling model
66
+ for M2 in $scoring_models; do
67
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
68
+ python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
69
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
70
+ done
71
+ done
72
+ done
73
+
74
+ # evaluate DetectGPT and its improvement DetectLLM
75
+ for D in $datasets; do
76
+ for M in $source_models; do
77
+ M1=t5-3b # perturbation model
78
+ for M2 in $scoring_models; do
79
+ echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
80
+ python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
81
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
82
+ # we leverage DetectGPT to generate the perturbations
83
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
84
+ python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
85
+ --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
86
+ done
87
+ done
88
+ done
topp.sh ADDED
@@ -0,0 +1,88 @@
1
+ #!/usr/bin/env bash
2
+ # Copyright (c) Guangsheng Bao.
3
+ #
4
+ # This source code is licensed under the MIT license found in the
5
+ # LICENSE file in the root directory of this source tree.
6
+
7
+ # setup the environment
8
+ echo `date`, Setup the environment ...
9
+ set -e # exit if error
10
+
11
+ # prepare folders
12
+ exp_path=exp_topp
13
+ data_path=$exp_path/data
14
+ res_path=$exp_path/results
15
+ mkdir -p $exp_path $data_path $res_path
16
+
17
+ datasets="xsum squad writing"
18
+ source_models="gpt2-xl opt-2.7b gpt-neo-2.7B gpt-j-6B gpt-neox-20b"
19
+
20
+ # preparing dataset
21
+ for D in $datasets; do
22
+ for M in $source_models; do
23
+ echo `date`, Preparing dataset ${D}-${M} ...
24
+ python scripts/data_builder.py --dataset $D --n_samples 500 --do_top_p --base_model_name $M --output_file $data_path/${D}_${M}
25
+ done
26
+ done
27
+
28
+ # White-box Setting
29
+ echo `date`, Evaluate models in the white-box setting:
30
+
31
+ # evaluate Fast-DetectGPT and fast baselines
32
+ for D in $datasets; do
33
+ for M in $source_models; do
34
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M} ...
35
+ python scripts/fast_detect_gpt.py --reference_model_name $M --scoring_model_name $M --dataset $D \
36
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
37
+
38
+ echo `date`, Evaluating baseline methods on ${D}_${M} ...
39
+ python scripts/baselines.py --scoring_model_name $M --dataset $D \
40
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
41
+ done
42
+ done
43
+
44
+ # evaluate DetectGPT and its improvement DetectLLM
45
+ for D in $datasets; do
46
+ for M in $source_models; do
47
+ echo `date`, Evaluating DetectGPT on ${D}_${M} ...
48
+ python scripts/detect_gpt.py --scoring_model_name $M --mask_filling_model_name t5-3b --n_perturbations 100 --dataset $D \
49
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}
50
+ # we leverage DetectGPT to generate the perturbations
51
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M} ...
52
+ python scripts/detect_llm.py --scoring_model_name $M --dataset $D \
53
+ --dataset_file $data_path/${D}_${M}.t5-3b.perturbation_100 --output_file $res_path/${D}_${M}
54
+ done
55
+ done
56
+
57
+
58
+ # Black-box Setting
59
+ echo `date`, Evaluate models in the black-box setting:
60
+ scoring_models="gpt-neo-2.7B"
61
+
62
+ # evaluate Fast-DetectGPT
63
+ for D in $datasets; do
64
+ for M in $source_models; do
65
+ M1=gpt-j-6B # sampling model
66
+ for M2 in $scoring_models; do
67
+ echo `date`, Evaluating Fast-DetectGPT on ${D}_${M}.${M1}_${M2} ...
68
+ python scripts/fast_detect_gpt.py --reference_model_name ${M1} --scoring_model_name ${M2} --dataset $D \
69
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
70
+ done
71
+ done
72
+ done
73
+
74
+ # evaluate DetectGPT and its improvement DetectLLM
75
+ for D in $datasets; do
76
+ for M in $source_models; do
77
+ M1=t5-3b # perturbation model
78
+ for M2 in $scoring_models; do
79
+ echo `date`, Evaluating DetectGPT on ${D}_${M}.${M1}_${M2} ...
80
+ python scripts/detect_gpt.py --mask_filling_model_name ${M1} --scoring_model_name ${M2} --n_perturbations 100 --dataset $D \
81
+ --dataset_file $data_path/${D}_${M} --output_file $res_path/${D}_${M}.${M1}_${M2}
82
+ # we leverage DetectGPT to generate the perturbations
83
+ echo `date`, Evaluating DetectLLM methods on ${D}_${M}.${M1}_${M2} ...
84
+ python scripts/detect_llm.py --scoring_model_name ${M2} --dataset $D \
85
+ --dataset_file $data_path/${D}_${M}.${M1}.perturbation_100 --output_file $res_path/${D}_${M}.${M1}_${M2}
86
+ done
87
+ done
88
+ done
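temperature.sh, topk.sh and topp.sh above are identical apart from the experiment folder (`exp_temperature`, `exp_topk`, `exp_topp`) and the decoding flag handed to data_builder.py (`--do_temperature`, `--do_top_k`, `--do_top_p`); together they test how the source model's sampling strategy affects detectability. data_builder.py is not part of this hunk, but those flags presumably map onto the standard decoding options of `generate()` in transformers, roughly as in the sketch below (the flag-to-argument mapping and the concrete values are assumptions, not the repo's exact settings):

```python
# Assumed correspondence between the shell flags and transformers' generate() options.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("The river flooded after", return_tensors="pt")

decoding = {"temperature": {"do_sample": True, "temperature": 0.8},   # --do_temperature (value assumed)
            "top_k":       {"do_sample": True, "top_k": 40},          # --do_top_k (value assumed)
            "top_p":       {"do_sample": True, "top_p": 0.96}}        # --do_top_p (value assumed)

for name, kwargs in decoding.items():
    out = model.generate(**inputs, max_new_tokens=30,
                         pad_token_id=tokenizer.eos_token_id, **kwargs)
    print(name, tokenizer.decode(out[0], skip_special_tokens=True))
```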