Machine-generated text-detection by fine-tuning of language models

This project is related to a bachelor's thesis with the title "Turning Poachers into Gamekeepers: Detecting Machine-Generated Text in Academia using Large Language Models" (see here) written by Nicolai Thorer Sivesind and Andreas Bentzen Winje at the Department of Computer Science at the Norwegian University of Science and Technology.

It contains text classification models trained to distinguish human-written text from text generated by language models like ChatGPT and GPT-3. The best models achieved an accuracy of 100% on real and GPT-3-generated Wikipedia articles (4500 samples), and an accuracy of 98.4% on real and ChatGPT-generated research abstracts (3000 samples).

The dataset card for the dataset created for this project can be found here.

NOTE: The hosted inference on this site only works for the RoBERTa models, not for the Bloomz models. The Bloomz models can also produce wrong predictions if the attention mask from the tokenizer is not passed explicitly to the model at inference time. Running inference through the pipeline API seems to produce the most consistent results.
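The two inference routes mentioned in the note can be sketched as follows. This is a minimal sketch, assuming the `transformers` and `torch` packages are installed and that the model id from this card's citation URL is the one you want; the helper names `pick_label`, `classify_with_pipeline` and `classify_with_mask` are hypothetical and not part of the project.

```python
MODEL_ID = "andreas122001/roberta-academic-detector"  # id taken from this card's citation URL


def pick_label(probabilities, id2label):
    """Return the most probable label and its score for one row of probabilities."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return {"label": id2label[best], "score": probabilities[best]}


def classify_with_pipeline(texts, model_id=MODEL_ID):
    """Route 1: the pipeline API, which builds and passes the attention mask itself."""
    from transformers import pipeline  # lazy import: only needed when this route is used

    detector = pipeline("text-classification", model=model_id)
    return detector(texts, truncation=True)


def classify_with_mask(texts, model_id=MODEL_ID):
    """Route 2: manual inference, passing the tokenizer's attention mask explicitly."""
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()

    # Padding makes the attention mask matter: it tells the model which tokens are real.
    batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)
    with torch.no_grad():
        logits = model(
            input_ids=batch["input_ids"],
            attention_mask=batch["attention_mask"],
        ).logits

    probabilities = torch.softmax(logits, dim=-1).tolist()
    return [pick_label(row, model.config.id2label) for row in probabilities]
```

Calling `classify_with_pipeline(["Some abstract to check."])` downloads the model on first use. The label names returned depend on the model's configuration and are not assumed here.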

Fine-tuned detectors

This project includes 12 fine-tuned models, based on the RoBERTa-base model and on three sizes of the Bloomz model (560m, 1b7 and 3b).

Datasets

The models were trained on selections from the GPT-wiki-intros and ChatGPT-Research-Abstracts datasets, and fall into three types: wiki-detectors, academic-detectors and mixed-detectors.

  • Wiki-detectors:
    • Trained on 30'000 datapoints (10%) of GPT-wiki-intros.
    • Best model (in-domain) is Bloomz-3b-wiki, with an accuracy of 100%.
  • Academic-detectors:
    • Trained on 20'000 datapoints (100%) of ChatGPT-Research-Abstracts.
    • Best model (in-domain) is Bloomz-3b-academic, with an accuracy of 98.4%.
  • Mixed-detectors:
    • Trained on 15'000 datapoints (5%) of GPT-wiki-intros and 10'000 datapoints (50%) of ChatGPT-Research-Abstracts.
    • Best model (in-domain) is RoBERTa-mixed, with an F1-score of 99.3%.
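
The selection sizes listed above can be cross-checked with a little arithmetic. The sketch below only restates the counts and percentages from the list; the names `SELECTIONS`, `implied_dataset_sizes` and `training_set_size` are hypothetical helpers introduced for illustration.

```python
# (datapoints used, stated fraction of the source dataset), copied from the list above
SELECTIONS = {
    "wiki": {"GPT-wiki-intros": (30_000, 0.10)},
    "academic": {"ChatGPT-Research-Abstracts": (20_000, 1.00)},
    "mixed": {
        "GPT-wiki-intros": (15_000, 0.05),
        "ChatGPT-Research-Abstracts": (10_000, 0.50),
    },
}


def implied_dataset_sizes(selections):
    """Collect the full dataset size implied by each (count, fraction) pair."""
    sizes = {}
    for parts in selections.values():
        for dataset, (count, fraction) in parts.items():
            sizes.setdefault(dataset, set()).add(round(count / fraction))
    return sizes


def training_set_size(selections, detector):
    """Total number of datapoints used for one detector type."""
    return sum(count for count, _ in selections[detector].values())


print(implied_dataset_sizes(SELECTIONS))
print(training_set_size(SELECTIONS, "mixed"))
```

Each dataset maps to a single implied size (300,000 datapoints for GPT-wiki-intros and 20,000 for ChatGPT-Research-Abstracts), so the stated percentages are mutually consistent, and the mixed-detectors see 25,000 datapoints in total.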

Hyperparameters

All models were trained using the same hyperparameters:

{
 "num_train_epochs": 1,
 "adam_beta1": 0.9,
 "adam_beta2": 0.999,
 "batch_size": 8,
 "adam_epsilon": 1e-08,
 "optim": "adamw_torch", # the optimizer (AdamW)
 "learning_rate": 5e-05, # (LR)
 "lr_scheduler_type": "linear", # scheduler type for LR
 "seed": 42, # seed for the PyTorch random number generator
}
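
As an illustration, the hyperparameters above could be mapped onto the Hugging Face `TrainingArguments` class. This is a sketch under two assumptions that the card does not confirm: that the models were trained with the `transformers` Trainer, and that `batch_size` means the per-device training batch size. The helper names and the `output_dir` value are hypothetical.

```python
HYPERPARAMETERS = {
    "num_train_epochs": 1,
    "adam_beta1": 0.9,
    "adam_beta2": 0.999,
    "batch_size": 8,
    "adam_epsilon": 1e-08,
    "optim": "adamw_torch",
    "learning_rate": 5e-05,
    "lr_scheduler_type": "linear",
    "seed": 42,
}


def to_training_kwargs(hparams, output_dir="detector-output"):
    """Translate the card's hyperparameters into TrainingArguments keyword arguments."""
    kwargs = dict(hparams)  # copy so the original dictionary is left untouched
    # Assumption: "batch_size" corresponds to per_device_train_batch_size.
    kwargs["per_device_train_batch_size"] = kwargs.pop("batch_size")
    kwargs["output_dir"] = output_dir  # hypothetical output location
    return kwargs


def build_training_arguments(hparams, output_dir="detector-output"):
    """Construct TrainingArguments; requires transformers and torch to be installed."""
    from transformers import TrainingArguments  # lazy import

    return TrainingArguments(**to_training_kwargs(hparams, output_dir))
```

The remaining keys (`adam_beta1`, `optim`, `lr_scheduler_type` and so on) already match `TrainingArguments` parameter names, so only the batch size needs renaming.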

Metrics

Metrics can be found at https://wandb.ai/idatt2900-072/IDATT2900-072.

In-domain performance of wiki-detectors (an asterisk marks the best score in each column):

Base model     Accuracy   Precision   Recall   F1-score
Bloomz-560m    0.973      *1.000      0.945    0.972
Bloomz-1b7     0.972      *1.000      0.945    0.972
Bloomz-3b      *1.000     *1.000      *1.000   *1.000
RoBERTa        0.998      0.999       0.997    0.998

In-domain performance of academic-detectors:

Base model     Accuracy   Precision   Recall   F1-score
Bloomz-560m    0.964      0.963       0.965    0.964
Bloomz-1b7     0.946      0.941       0.951    0.946
Bloomz-3b      *0.984     *0.983      0.985    *0.984
RoBERTa        0.982      0.968       *0.997   0.982
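
The F1-scores in the two in-domain tables follow from the reported precision and recall as their harmonic mean, F1 = 2PR / (P + R). The sketch below recomputes them from the rounded table values; the names `REPORTED`, `f1_score` and `max_f1_deviation` are hypothetical helpers, and small deviations are expected because the inputs are rounded to three decimals.

```python
# (precision, recall, reported F1) copied from the in-domain tables above
REPORTED = {
    "wiki": {
        "Bloomz-560m": (1.000, 0.945, 0.972),
        "Bloomz-1b7": (1.000, 0.945, 0.972),
        "Bloomz-3b": (1.000, 1.000, 1.000),
        "RoBERTa": (0.999, 0.997, 0.998),
    },
    "academic": {
        "Bloomz-560m": (0.963, 0.965, 0.964),
        "Bloomz-1b7": (0.941, 0.951, 0.946),
        "Bloomz-3b": (0.983, 0.985, 0.984),
        "RoBERTa": (0.968, 0.997, 0.982),
    },
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall; defined as 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def max_f1_deviation(reported):
    """Largest absolute gap between recomputed and reported F1 across all models."""
    return max(
        abs(f1_score(precision, recall) - reported_f1)
        for models in reported.values()
        for precision, recall, reported_f1 in models.values()
    )


print(f"largest deviation: {max_f1_deviation(REPORTED):.4f}")
```

A deviation well below 0.001 indicates that the reported F1 column is consistent with the precision and recall columns.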

F1-scores of the mixed-detectors on all three datasets (CRA = ChatGPT-Research-Abstracts):

Base model     Mixed    Wiki     CRA
Bloomz-560m    0.948    0.972    *0.848
Bloomz-1b7     0.929    0.964    0.816
Bloomz-3b      0.988    0.996    0.772
RoBERTa        *0.993   *0.997   0.829

Credits

Citation

Please use the following citation:

@misc {sivesind_2023,
    author       = { {Nicolai Thorer Sivesind} and {Andreas Bentzen Winje} },
    title        = { Machine-generated text-detection by fine-tuning of language models },
    url          = { https://huggingface.co/andreas122001/roberta-academic-detector },
    year         = 2023,
    publisher    = { Hugging Face }
}