mdeberta-ner-ghtk-hirach_NER-first_1000_data-3090-15Nov

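This model is a fine-tuned version of microsoft/mdeberta-v3-base. The card does not specify the training dataset. The model achieves the following results on the evaluation set; these match the final epoch (step 10000) in the training results: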

  • Loss: 0.0975
  • Accuracy: 0.9820
  • F1: 0.4359
  • Precision: 0.4857
  • Recall: 0.3953
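
As a quick consistency check, F1 is the harmonic mean of precision and recall, so it can be recomputed from the two values above. The sketch below does that. The entity counts in its second half (17 correct predictions out of 35 predicted and 43 gold entities) are an assumption inferred from the rounded ratios; the card does not report raw counts.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Rounded values reported on the evaluation set.
precision, recall = 0.4857, 0.3953
f1 = f1_score(precision, recall)
print(round(f1, 4))  # 0.4359, matching the reported F1

# ASSUMPTION: hypothetical counts that reproduce the rounded metrics.
# They are not stated anywhere in the card.
tp, predicted, gold = 17, 35, 43
print(round(tp / predicted, 4), round(tp / gold, 4))  # 0.4857 0.3953
```

If the assumed counts are roughly right, the evaluation set contains only a few dozen entities, which would explain how accuracy can stay near 0.98 while F1 is low: most tokens are likely non-entity tokens, so token-level accuracy says little about entity quality.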

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 2.5e-05
  • train_batch_size: 4
  • eval_batch_size: 4
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 40
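
To make the schedule concrete, the sketch below works out the step counts implied by these hyperparameters and evaluates a linear learning-rate decay at a few points. It is a plain-Python illustration, not the training code: the helper `linear_lr` is hypothetical, the 250 steps per epoch are read from the training results, and zero warmup, a single device, and no gradient accumulation are assumptions because the card lists none of them.

```python
def linear_lr(step: int, total_steps: int, base_lr: float, warmup_steps: int = 0) -> float:
    """Linear warmup (if any) followed by linear decay to zero."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


BASE_LR = 2.5e-05
TRAIN_BATCH_SIZE = 4
NUM_EPOCHS = 40
STEPS_PER_EPOCH = 250  # from the training results: step 250 at epoch 1.0

total_steps = NUM_EPOCHS * STEPS_PER_EPOCH  # 10000, the last logged step
# ASSUMPTION: single device, no gradient accumulation.
examples_per_epoch = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE  # 1000

for step in (0, 2500, 5000, 10000):
    # ASSUMPTION: warmup_steps=0, since the card lists no warmup.
    print(step, linear_lr(step, total_steps, BASE_LR))
```

Under these assumptions each epoch covers 1000 training examples, which is consistent with "first_1000_data" in the model name, and the learning rate falls to half its initial value at step 5000.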

Training results

Training Loss  Epoch  Step   Validation Loss  Accuracy  F1      Precision  Recall
No log         1.0    250    0.0903           0.9825    0.0     0.0        0.0
0.1391         2.0    500    0.0941           0.9825    0.0     0.0        0.0
0.1391         3.0    750    0.0933           0.9825    0.0     0.0        0.0
0.075          4.0    1000   0.0924           0.9825    0.0     0.0        0.0
0.075          5.0    1250   0.0894           0.9825    0.0     0.0        0.0
0.0634         6.0    1500   0.0870           0.9825    0.0851  0.5        0.0465
0.0634         7.0    1750   0.0846           0.9820    0.0833  0.4        0.0465
0.0508         8.0    2000   0.0799           0.9825    0.1224  0.5        0.0698
0.0508         9.0    2250   0.0794           0.9829    0.125   0.6        0.0698
0.0394         10.0   2500   0.0793           0.9800    0.0755  0.2        0.0465
0.0394         11.0   2750   0.0801           0.9808    0.2034  0.375      0.1395
0.0302         12.0   3000   0.0825           0.9812    0.2069  0.4        0.1395
0.0302         13.0   3250   0.0763           0.9829    0.2759  0.5333     0.1860
0.0232         14.0   3500   0.0755           0.9833    0.3692  0.5455     0.2791
0.0232         15.0   3750   0.0799           0.9829    0.3226  0.5263     0.2326
0.0176         16.0   4000   0.0785           0.9833    0.3692  0.5455     0.2791
0.0176         17.0   4250   0.0776           0.9825    0.3768  0.5        0.3023
0.0132         18.0   4500   0.0803           0.9833    0.3881  0.5417     0.3023
0.0132         19.0   4750   0.0826           0.9812    0.3611  0.4483     0.3023
0.0106         20.0   5000   0.0787           0.9825    0.4110  0.5        0.3488
0.0106         21.0   5250   0.0879           0.9816    0.3478  0.4615     0.2791
0.0085         22.0   5500   0.0848           0.9816    0.4156  0.4706     0.3721
0.0085         23.0   5750   0.0818           0.9825    0.4267  0.5        0.3721
0.0068         24.0   6000   0.0816           0.9833    0.4533  0.5312     0.3953
0.0068         25.0   6250   0.0819           0.9825    0.4267  0.5        0.3721
0.0056         26.0   6500   0.0848           0.9833    0.4533  0.5312     0.3953
0.0056         27.0   6750   0.0872           0.9833    0.4533  0.5312     0.3953
0.0049         28.0   7000   0.0844           0.9837    0.4595  0.5484     0.3953
0.0049         29.0   7250   0.0881           0.9820    0.4211  0.4848     0.3721
0.0042         30.0   7500   0.0925           0.9820    0.45    0.4865     0.4186
0.0042         31.0   7750   0.0924           0.9825    0.4267  0.5        0.3721
0.0038         32.0   8000   0.0938           0.9833    0.4675  0.5294     0.4186
0.0038         33.0   8250   0.0939           0.9825    0.4416  0.5        0.3953
0.0032         34.0   8500   0.0941           0.9833    0.4384  0.5333     0.3721
0.0032         35.0   8750   0.0942           0.9833    0.4675  0.5294     0.4186
0.0029         36.0   9000   0.0949           0.9820    0.4359  0.4857     0.3953
0.0029         37.0   9250   0.0961           0.9820    0.4359  0.4857     0.3953
0.0027         38.0   9500   0.0980           0.9820    0.4359  0.4857     0.3953
0.0027         39.0   9750   0.0972           0.9820    0.4359  0.4857     0.3953
0.0026         40.0   10000  0.0975           0.9820    0.4359  0.4857     0.3953
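
The per-epoch numbers above can be scanned for the epochs that score best on each criterion. The sketch below copies the validation loss and F1 columns into lists and picks the epoch with the lowest validation loss and the epoch with the highest F1. It only reads the logged metrics; whether checkpoints from those epochs were saved is not stated in the card.

```python
# Validation loss and F1 per epoch (epochs 1..40), copied from the results above.
val_loss = [
    0.0903, 0.0941, 0.0933, 0.0924, 0.0894, 0.0870, 0.0846, 0.0799, 0.0794, 0.0793,
    0.0801, 0.0825, 0.0763, 0.0755, 0.0799, 0.0785, 0.0776, 0.0803, 0.0826, 0.0787,
    0.0879, 0.0848, 0.0818, 0.0816, 0.0819, 0.0848, 0.0872, 0.0844, 0.0881, 0.0925,
    0.0924, 0.0938, 0.0939, 0.0941, 0.0942, 0.0949, 0.0961, 0.0980, 0.0972, 0.0975,
]
f1 = [
    0.0, 0.0, 0.0, 0.0, 0.0, 0.0851, 0.0833, 0.1224, 0.125, 0.0755,
    0.2034, 0.2069, 0.2759, 0.3692, 0.3226, 0.3692, 0.3768, 0.3881, 0.3611, 0.4110,
    0.3478, 0.4156, 0.4267, 0.4533, 0.4267, 0.4533, 0.4533, 0.4595, 0.4211, 0.45,
    0.4267, 0.4675, 0.4416, 0.4384, 0.4675, 0.4359, 0.4359, 0.4359, 0.4359, 0.4359,
]

epochs = range(1, len(val_loss) + 1)

# min/max return the first matching epoch when several epochs tie.
best_loss_epoch = min(epochs, key=lambda e: val_loss[e - 1])
best_f1_epoch = max(epochs, key=lambda e: f1[e - 1])

print(best_loss_epoch, val_loss[best_loss_epoch - 1])  # 14 0.0755
print(best_f1_epoch, f1[best_f1_epoch - 1])            # 32 0.4675
print(f1[-1])                                          # 0.4359 (final epoch)
```

Validation loss bottoms out at epoch 14 and rises afterwards while training loss keeps falling, which usually indicates overfitting. F1 peaks at 0.4675 (epochs 32 and 35), slightly above the final-epoch value of 0.4359 reported at the top of the card.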

Framework versions

  • Transformers 4.44.2
  • Pytorch 2.4.1+cu121
  • Datasets 3.1.0
  • Tokenizers 0.19.1

Model size

  • Parameters: 279M (Safetensors)
  • Tensor type: F32