qa-persian-distilbert-fa-zwnj-base

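This model is a fine-tuned version of makhataei/qa-persian-distilbert-fa-zwnj-base. The card lists the model itself as its base, so the original checkpoint is not identified, and the training dataset is not recorded. It achieves the following result on the evaluation set: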

  • Loss: 5.3843

Model description

More information needed

Intended uses & limitations

More information needed
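
Until this section is filled in, the following is a minimal sketch of how the checkpoint might be loaded for extractive question answering. It assumes, from the model name only, that the checkpoint carries a question-answering head and that the repository ships tokenizer files. `answer_question` is a hypothetical helper, not part of the model or the library. The card reports no accuracy metrics, so output quality should be checked before relying on it.

```python
MODEL_ID = "makhataei/qa-persian-distilbert-fa-zwnj-base"


def answer_question(question: str, context: str) -> dict:
    """Run extractive QA with the checkpoint (hypothetical helper).

    Assumes the checkpoint has a question-answering head; the model
    card does not confirm this.
    """
    # Imported lazily so this module loads even without transformers installed.
    from transformers import pipeline

    # Downloads the model and tokenizer from the Hub on first use.
    qa = pipeline("question-answering", model=MODEL_ID, tokenizer=MODEL_ID)
    # The pipeline returns a dict with "answer", "score", "start" and "end".
    return qa(question=question, context=context)
```

Calling `answer_question(question, context)` with a Persian question and a passage containing the answer returns the extracted span and its score. The pipeline is rebuilt on every call here for simplicity; a real application would construct it once and reuse it.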

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 6.25e-09
  • train_batch_size: 14
  • eval_batch_size: 14
  • seed: 42
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: linear
  • num_epochs: 100
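
To make the schedule concrete, the sketch below computes the learning rate that a linear scheduler would produce at a few points in training. The hyperparameter values are copied from the list above, and the 9 steps per epoch come from the training results table. The warmup length is not stated in the card, so zero warmup is an assumption, and `linear_lr` is an illustrative helper rather than the library's implementation.

```python
# Hyperparameters copied from the list above.
HYPERPARAMS = {
    "learning_rate": 6.25e-09,
    "train_batch_size": 14,
    "eval_batch_size": 14,
    "seed": 42,
    "adam_betas": (0.9, 0.999),
    "adam_epsilon": 1e-08,
    "lr_scheduler_type": "linear",
    "num_epochs": 100,
}

STEPS_PER_EPOCH = 9  # from the training results table (step 9 at epoch 1.0)
WARMUP_STEPS = 0     # assumption: the card does not report any warmup
TOTAL_STEPS = HYPERPARAMS["num_epochs"] * STEPS_PER_EPOCH


def linear_lr(step: int) -> float:
    """Learning rate at `step`: linear warmup, then linear decay to zero."""
    base = HYPERPARAMS["learning_rate"]
    if step < WARMUP_STEPS:
        return base * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return base * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the rate at the start, midpoint, and end of training.
for step in (0, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(f"step {step:>3}: lr = {linear_lr(step):.4e}")
```

Under these assumptions the rate starts at 6.25e-09 and decays to zero at step 900, so every update in the run is scaled by a factor no larger than 6.25e-09.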

Training results

Training Loss   Epoch   Step   Validation Loss
5.4975          1.0     9      5.3843
5.6974          2.0     18     5.3843
5.681           3.0     27     5.3843
5.7298          4.0     36     5.3843
5.7675          5.0     45     5.3843
5.7265          6.0     54     5.3843
5.6502          7.0     63     5.3843
5.6803          8.0     72     5.3843
5.6433          9.0     81     5.3843
5.6107          10.0    90     5.3843
5.5624          11.0    99     5.3843
5.6151          12.0    108    5.3843
5.6815          13.0    117    5.3843
5.6993          14.0    126    5.3843
5.6933          15.0    135    5.3843
5.7421          16.0    144    5.3843
5.7573          17.0    153    5.3843
5.7137          18.0    162    5.3843
5.7891          19.0    171    5.3843
5.7035          20.0    180    5.3843
5.6504          21.0    189    5.3843
5.7166          22.0    198    5.3843
5.6868          23.0    207    5.3843
5.7905          24.0    216    5.3843
5.7363          25.0    225    5.3843
5.7459          26.0    234    5.3843
5.7354          27.0    243    5.3843
5.7545          28.0    252    5.3843
5.6522          29.0    261    5.3843
5.6467          30.0    270    5.3843
5.7483          31.0    279    5.3843
5.7255          32.0    288    5.3843
5.6064          33.0    297    5.3843
5.6728          34.0    306    5.3843
5.6922          35.0    315    5.3843
5.6817          36.0    324    5.3843
5.6892          37.0    333    5.3843
5.609           38.0    342    5.3843
5.6179          39.0    351    5.3843
5.6384          40.0    360    5.3843
5.6311          41.0    369    5.3843
5.5614          42.0    378    5.3843
5.4875          43.0    387    5.3843
5.5113          44.0    396    5.3843
5.4597          45.0    405    5.3843
5.7105          46.0    414    5.3843
5.5722          47.0    423    5.3843
5.4466          48.0    432    5.3843
5.3902          49.0    441    5.3843
5.5197          50.0    450    5.3843
5.4349          51.0    459    5.3843
5.4746          52.0    468    5.3843
5.5058          53.0    477    5.3843
5.5615          54.0    486    5.3843
5.5838          55.0    495    5.3843
5.6564          56.0    504    5.3843
5.6402          57.0    513    5.3843
5.6022          58.0    522    5.3843
5.6428          59.0    531    5.3843
5.6259          60.0    540    5.3843
5.6678          61.0    549    5.3843
5.6119          62.0    558    5.3843
5.614           63.0    567    5.3843
5.6349          64.0    576    5.3843
5.5935          65.0    585    5.3843
5.7087          66.0    594    5.3843
5.6243          67.0    603    5.3843
5.6718          68.0    612    5.3843
5.5945          69.0    621    5.3843
5.6609          70.0    630    5.3843
5.7069          71.0    639    5.3843
5.6578          72.0    648    5.3843
5.706           73.0    657    5.3843
5.7486          74.0    666    5.3843
5.5958          75.0    675    5.3843
5.6005          76.0    684    5.3843
5.6954          77.0    693    5.3843
5.6576          78.0    702    5.3843
5.6537          79.0    711    5.3843
5.6949          80.0    720    5.3843
5.7134          81.0    729    5.3843
5.7391          82.0    738    5.3843
5.5262          83.0    747    5.3843
5.7075          84.0    756    5.3843
5.6827          85.0    765    5.3843
5.6573          86.0    774    5.3843
5.738           87.0    783    5.3843
5.7347          88.0    792    5.3843
5.6938          89.0    801    5.3843
5.7081          90.0    810    5.3843
5.7208          91.0    819    5.3843
5.7367          92.0    828    5.3843
5.7761          93.0    837    5.3843
5.7187          94.0    846    5.3843
5.7559          95.0    855    5.3843
5.7001          96.0    864    5.3843
5.7402          97.0    873    5.3843
5.6641          98.0    882    5.3843
5.7209          99.0    891    5.3843
5.7791          100.0   900    5.3843
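
Two things can be read off the table with simple arithmetic: the number of optimizer steps per epoch, and whether the validation loss ever changes. The sketch below uses a four-row excerpt of the table together with the train batch size of 14 from the hyperparameter list. The bound on the training set size is an estimate that assumes a single device, no gradient accumulation, and that the last partial batch is kept. None of these are stated in the card.

```python
import math

# Excerpt of (training_loss, epoch, step, validation_loss) rows from the table.
ROWS = [
    (5.4975, 1.0, 9, 5.3843),
    (5.6974, 2.0, 18, 5.3843),
    (5.3902, 49.0, 441, 5.3843),
    (5.7791, 100.0, 900, 5.3843),
]
TRAIN_BATCH_SIZE = 14  # from the hyperparameter list

# Steps per epoch should be identical for every logged row.
steps_per_epoch_values = {round(step / epoch) for _, epoch, step, _ in ROWS}
steps_per_epoch = steps_per_epoch_values.pop()

# Assumption: steps_per_epoch == ceil(n_examples / batch_size), i.e. one
# device, no gradient accumulation, last partial batch kept.
min_examples = (steps_per_epoch - 1) * TRAIN_BATCH_SIZE + 1
max_examples = steps_per_epoch * TRAIN_BATCH_SIZE
assert math.ceil(min_examples / TRAIN_BATCH_SIZE) == steps_per_epoch
assert math.ceil(max_examples / TRAIN_BATCH_SIZE) == steps_per_epoch

# Collect the distinct validation losses and the lowest training loss.
val_losses = {val for *_, val in ROWS}
best_train = min(train for train, *_ in ROWS)

print(f"steps per epoch: {steps_per_epoch}")
print(f"estimated training examples: {min_examples} to {max_examples}")
print(f"distinct validation losses: {sorted(val_losses)}")
print(f"lowest training loss in excerpt: {best_train}")
```

In the excerpt, as in the full table, the validation loss is 5.3843 at every logged epoch, while the training loss moves between roughly 5.39 and 5.79 without a clear downward trend.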

Framework versions

  • Transformers 4.35.2
  • Pytorch 2.0.1+cu117
  • Datasets 2.15.0
  • Tokenizers 0.15.0

Model size

  • Format: Safetensors
  • Parameters: 75.2M
  • Tensor type: F32