tinyllama-icd_qa_5q_all_or_nothing

This model is a fine-tuned version of TinyLlama/TinyLlama-1.1B-intermediate-step-1431k-3T on an unspecified dataset (the training data is not documented here). It achieves the following results on the evaluation set:

  • Loss: 1.0245
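The checkpoint can be loaded with the Hugging Face transformers library. The snippet below is a minimal sketch; the prompt format and generation settings are assumptions, since the card does not document them:

```python
def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Greedy-decode a completion from the fine-tuned checkpoint."""
    # Heavy dependencies are imported lazily so defining this sketch needs none.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    repo_id = "2ndBestKiller/tinyllama-icd_qa_5q_all_or_nothing"
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id)

    inputs = tokenizer(prompt, return_tensors="pt")
    with torch.no_grad():
        output = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output[0], skip_special_tokens=True)
```

Calling, e.g., `generate("Q: ...")` downloads the weights on first use; how the question-answering prompts were formatted during fine-tuning is not specified in this card.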

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 5e-05
  • train_batch_size: 1
  • eval_batch_size: 1
  • seed: 42
  • gradient_accumulation_steps: 2
  • total_train_batch_size: 2
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 5
  • mixed_precision_training: Native AMP
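With lr_scheduler_type set to linear and no warmup steps listed, the learning rate presumably decays linearly from 5e-05 at step 0 to 0 at the end of training. A sketch of that schedule; the total step count is an assumption extrapolated from the step/epoch ratio in the results table:

```python
BASE_LR = 5e-05
TOTAL_STEPS = 118_708  # assumption: ~5 epochs at ~23,740 optimizer steps/epoch

def linear_lr(step: int, base_lr: float = BASE_LR, total_steps: int = TOTAL_STEPS) -> float:
    """Linear decay from base_lr at step 0 to 0 at total_steps (no warmup)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)
```

Note also that with train_batch_size 1 and gradient_accumulation_steps 2, the effective (total) train batch size is 2, matching the value above.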

Training results

Training Loss | Epoch | Step | Validation Loss
1.897 0.0421 1000 2.0673
1.8168 0.0842 2000 1.8269
1.8585 0.1264 3000 1.7420
1.4869 0.1685 4000 1.6427
1.4204 0.2106 5000 1.5763
1.2253 0.2527 6000 1.5298
1.326 0.2948 7000 1.4959
1.5507 0.3370 8000 1.4615
1.2435 0.3791 9000 1.4142
1.5801 0.4212 10000 1.3806
1.1879 0.4633 11000 1.3511
1.2121 0.5054 12000 1.3304
1.0262 0.5476 13000 1.3335
1.0754 0.5897 14000 1.2921
1.0264 0.6318 15000 1.2785
1.0067 0.6739 16000 1.2544
1.1532 0.7160 17000 1.2323
1.1084 0.7582 18000 1.2354
1.1094 0.8003 19000 1.2106
0.9589 0.8424 20000 1.1987
0.9922 0.8845 21000 1.1946
1.0851 0.9266 22000 1.1723
1.0407 0.9688 23000 1.1528
0.7835 1.0109 24000 1.1521
0.9736 1.0530 25000 1.1451
0.9463 1.0951 26000 1.1424
0.9879 1.1372 27000 1.1394
0.8803 1.1794 28000 1.1301
0.9309 1.2215 29000 1.1380
0.9263 1.2636 30000 1.1157
0.975 1.3057 31000 1.1028
0.7661 1.3479 32000 1.0962
0.8526 1.3900 33000 1.0979
0.9277 1.4321 34000 1.0948
0.8177 1.4742 35000 1.0887
0.8935 1.5163 36000 1.0810
0.8534 1.5585 37000 1.0868
0.8604 1.6006 38000 1.0673
0.8426 1.6427 39000 1.0699
0.9027 1.6848 40000 1.0588
0.8062 1.7269 41000 1.0487
1.0168 1.7691 42000 1.0508
0.8437 1.8112 43000 1.0416
0.9178 1.8533 44000 1.0256
0.9543 1.8954 45000 1.0266
0.787 1.9375 46000 1.0247
0.7192 1.9797 47000 1.0170
0.8496 2.0218 48000 1.0374
0.7649 2.0639 49000 1.0333
0.7686 2.1060 50000 1.0282
0.6953 2.1481 51000 1.0420
0.8024 2.1903 52000 1.0333
0.823 2.2324 53000 1.0265
0.6479 2.2745 54000 1.0156
0.7726 2.3166 55000 1.0142
0.7353 2.3587 56000 1.0093
0.6597 2.4009 57000 1.0133
0.8428 2.4430 58000 1.0154
0.7129 2.4851 59000 1.0113
0.6315 2.5272 60000 1.0110
0.7019 2.5693 61000 1.0101
0.722 2.6115 62000 0.9969
0.8638 2.6536 63000 0.9973
0.9256 2.6957 64000 0.9930
0.6812 2.7378 65000 0.9942
0.7772 2.7799 66000 0.9978
0.6935 2.8221 67000 0.9807
0.7865 2.8642 68000 0.9797
0.758 2.9063 69000 0.9857
0.8565 2.9484 70000 0.9729
0.7374 2.9905 71000 0.9722
0.6262 3.0327 72000 1.0030
0.6128 3.0748 73000 1.0083
0.587 3.1169 74000 1.0000
0.6267 3.1590 75000 1.0184
0.5878 3.2011 76000 1.0107
0.6382 3.2433 77000 1.0090
0.738 3.2854 78000 1.0005
0.6962 3.3275 79000 1.0091
0.6249 3.3696 80000 1.0081
0.6458 3.4117 81000 1.0061
0.6177 3.4539 82000 1.0010
0.6046 3.4960 83000 1.0046
0.6263 3.5381 84000 1.0010
0.6269 3.5802 85000 0.9995
0.6503 3.6223 86000 1.0010
0.6702 3.6645 87000 0.9906
0.6865 3.7066 88000 0.9858
0.5789 3.7487 89000 0.9858
0.6636 3.7908 90000 0.9817
0.622 3.8330 91000 0.9871
0.5741 3.8751 92000 0.9849
0.6681 3.9172 93000 0.9766
0.6471 3.9593 94000 0.9739
0.5567 4.0014 95000 0.9776
0.5318 4.0436 96000 1.0329
0.5967 4.0857 97000 1.0379
0.5666 4.1278 98000 1.0377
0.5573 4.1699 99000 1.0397
0.533 4.2120 100000 1.0324
0.5331 4.2542 101000 1.0346
0.5689 4.2963 102000 1.0376
0.5983 4.3384 103000 1.0354
0.5405 4.3805 104000 1.0281
0.5718 4.4226 105000 1.0357
0.5416 4.4648 106000 1.0303
0.5482 4.5069 107000 1.0312
0.5459 4.5490 108000 1.0268
0.563 4.5911 109000 1.0300
0.549 4.6332 110000 1.0277
0.5049 4.6754 111000 1.0290
0.593 4.7175 112000 1.0259
0.5144 4.7596 113000 1.0240
0.6079 4.8017 114000 1.0242
0.4864 4.8438 115000 1.0257
0.5388 4.8860 116000 1.0257
0.5368 4.9281 117000 1.0264
0.4607 4.9702 118000 1.0245
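The Epoch and Step columns are consistent with roughly 23,700 optimizer steps per epoch, and the validation loss bottoms out at 0.9722 near the end of epoch 3 before drifting upward through epochs 4 and 5, so the best checkpoint may precede the final one. A quick sanity check, using only values copied from the table:

```python
# Values copied verbatim from the training-results table above.
final_step, final_epoch = 118_000, 4.9702
steps_per_epoch = final_step / final_epoch        # ~23,741 steps per epoch

# Validation loss at two notable checkpoints (step -> eval loss):
val_loss = {71_000: 0.9722, 118_000: 1.0245}
best_step = min(val_loss, key=val_loss.get)       # lowest eval loss of these rows
```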

Framework versions

  • Transformers 4.46.2
  • Pytorch 2.5.1+cu124
  • Datasets 3.1.0
  • Tokenizers 0.20.3
Model size: 1.1B params · Tensor type: F32 · Format: Safetensors