bert-base-cased-finetuned-ner-bio_nlp_2004

This model is a fine-tuned version of bert-base-cased for biomedical named entity recognition on the tner/bionlp2004 dataset.

It achieves the following results on the evaluation set (metrics rounded to four decimal places):

  • Loss: 0.2066

  • DNA:

    • Precision: 0.6619
    • Recall: 0.7472
    • F1: 0.7020
    • Number: 1056
  • RNA:

    • Precision: 0.5890
    • Recall: 0.7288
    • F1: 0.6515
    • Number: 118
  • Cell Line:

    • Precision: 0.4759
    • Recall: 0.6700
    • F1: 0.5565
    • Number: 500
  • Cell Type:

    • Precision: 0.7294
    • Recall: 0.7100
    • F1: 0.7196
    • Number: 1921
  • Protein:

    • Precision: 0.6658
    • Recall: 0.8263
    • F1: 0.7374
    • Number: 5067
  • Overall:

    • Precision: 0.6628
    • Recall: 0.7805
    • F1: 0.7169
    • Accuracy: 0.9367
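The per-entity blocks above (precision, recall, F1, plus a support count labeled Number) match the output format of the seqeval metric. Below is a minimal sketch of computing numbers of this kind with the evaluate library; the tag sequences are illustrative toy data, not drawn from this dataset.

```python
import evaluate

seqeval = evaluate.load("seqeval")

# Toy IOB-tagged sequences; in practice these would be the model's
# predicted labels and the gold labels for each token in the eval set.
predictions = [["B-protein", "I-protein", "O", "B-DNA", "O"]]
references = [["B-protein", "I-protein", "O", "B-DNA", "O"]]

results = seqeval.compute(predictions=predictions, references=references)
print(results["overall_precision"], results["overall_f1"])
print(results["protein"])  # {'precision': ..., 'recall': ..., 'f1': ..., 'number': ...}
```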

Model description

For details on how this model was created, see the project notebook: https://github.com/DunnBC22/NLP_Projects/blob/main/Token%20Classification/Monolingual/tner-bionlp2004/NER%20Project%20Using%20tner-bionlp%202004%20Dataset%20(BERT-Base).ipynb
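A minimal inference sketch using the transformers token-classification pipeline. The repo id below is this model's Hub id; the example sentence is made up for illustration.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hugging Face Hub.
ner = pipeline(
    "token-classification",
    model="DunnBC22/bert-base-cased-finetuned-ner-bio_nlp_2004",
    aggregation_strategy="simple",  # merge word pieces into whole entity spans
)

for entity in ner("The IL-2 gene is expressed in activated human T lymphocytes."):
    print(entity["entity_group"], entity["word"], round(entity["score"], 4))
```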

Intended uses & limitations

This model is intended to demonstrate my ability to solve a complex NLP problem: biomedical named entity recognition.

Training and evaluation data

Dataset Source: https://huggingface.co/datasets/tner/bionlp2004
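The data can be inspected with the datasets library. This is a sketch assuming the usual tner column layout of token lists plus integer tag ids.

```python
from datasets import load_dataset

dataset = load_dataset("tner/bionlp2004")
print(dataset)               # expected splits: train / validation / test
print(dataset["train"][0])   # e.g. {'tokens': [...], 'tags': [...]}
```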

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2e-05
  • train_batch_size: 16
  • eval_batch_size: 16
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 3
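As a rough sketch, these settings map onto transformers TrainingArguments as shown below; the output_dir is a placeholder, and this is a reconstruction rather than the exact training script.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bert-base-cased-finetuned-ner-bio_nlp_2004",  # placeholder path
    learning_rate=2e-5,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=16,
    seed=42,
    lr_scheduler_type="linear",
    num_train_epochs=3,
    # Adam with betas=(0.9, 0.999) and epsilon=1e-8 is the default
    # optimizer configuration in transformers, so no extra flags are needed.
)
```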

Training results

| Training Loss | Epoch | Step | Validation Loss | DNA Precision | DNA Recall | DNA F1 | DNA Number | RNA Precision | RNA Recall | RNA F1 | RNA Number | Cell Line Precision | Cell Line Recall | Cell Line F1 | Cell Line Number | Cell Type Precision | Cell Type Recall | Cell Type F1 | Cell Type Number | Protein Precision | Protein Recall | Protein F1 | Protein Number | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0.1701 | 1.0 | 1039 | 0.1927 | 0.6153 | 0.7254 | 0.6658 | 1056 | 0.6617 | 0.7458 | 0.7012 | 118 | 0.4670 | 0.6080 | 0.5282 | 500 | 0.6997 | 0.7158 | 0.7077 | 1921 | 0.6603 | 0.7833 | 0.7166 | 5067 | 0.6499 | — | — | — |
| 0.1450 | 2.0 | 2078 | 0.1981 | 0.6364 | 0.7443 | 0.6862 | 1056 | 0.6408 | 0.7712 | 0.7000 | 118 | 0.4607 | 0.6680 | 0.5453 | 500 | 0.7376 | 0.7022 | 0.7195 | 1921 | 0.6759 | 0.8149 | 0.7389 | 5067 | 0.6662 | — | — | — |
| 0.1116 | 3.0 | 3117 | 0.2066 | 0.6619 | 0.7472 | 0.7020 | 1056 | 0.5890 | 0.7288 | 0.6515 | 118 | 0.4759 | 0.6700 | 0.5565 | 500 | 0.7294 | 0.7100 | 0.7196 | 1921 | 0.6658 | 0.8263 | 0.7374 | 5067 | 0.6628 | 0.7805 | 0.7169 | 0.9367 |
  • Metrics shown above are rounded to the nearest ten-thousandth; a dash marks a value that was not recorded in the training log.

Framework versions

  • Transformers 4.28.1
  • Pytorch 2.0.0
  • Datasets 2.11.0
  • Tokenizers 0.13.3