metadata
base_model: unsloth/mistral-7b-v0.3-bnb-4bit
library_name: peft
license: apache-2.0
tags:
  - unsloth
  - generated_from_trainer
model-index:
  - name: Mistral-7B-v0.3_metamath_reverse
    results: []

Mistral-7B-v0.3_metamath_reverse

This model is a fine-tuned version of unsloth/mistral-7b-v0.3-bnb-4bit on an unknown dataset. It achieves the following results on the evaluation set:

  • Loss: 4.0369
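The card does not state how this loss is computed. Assuming it is the mean per-token cross-entropy in nats (the usual convention for causal language model evaluation, but an assumption here), it can be converted to perplexity as a rough sanity check. The helper name `perplexity` is illustrative only.

```python
import math

# Final evaluation loss reported in this card.
EVAL_LOSS = 4.0369


def perplexity(loss_nats: float) -> float:
    """Convert a mean per-token cross-entropy (in nats) to perplexity.

    Assumption: the reported loss is a natural-log cross-entropy averaged
    over tokens. The card itself does not confirm this.
    """
    return math.exp(loss_nats)


if __name__ == "__main__":
    print(f"eval loss {EVAL_LOSS} -> perplexity {perplexity(EVAL_LOSS):.2f}")
```

Under that assumption the reported loss corresponds to a perplexity of roughly 57.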

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 8
  • eval_batch_size: 8
  • seed: 42
  • gradient_accumulation_steps: 8
  • total_train_batch_size: 64
  • optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  • lr_scheduler_type: cosine
  • lr_scheduler_warmup_ratio: 0.02
  • num_epochs: 1
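As a rough illustration of how these values fit together, the sketch below recomputes the effective batch size and traces a cosine schedule with linear warmup. It is a minimal stand-alone approximation, not the trainer's own code. Two values are assumptions rather than facts from the card: a single device (which is what makes 8 × 8 equal the reported 64), and a total of about 617 optimizer steps, estimated from the training log (step 611 at epoch 0.9899).

```python
import math

# Values taken from the hyperparameter list in this card.
LEARNING_RATE = 3e-4
TRAIN_BATCH_SIZE = 8
GRADIENT_ACCUMULATION_STEPS = 8
WARMUP_RATIO = 0.02

# Assumptions (not stated in the card).
NUM_DEVICES = 1               # 8 * 8 * 1 matches the reported total of 64
ESTIMATED_TOTAL_STEPS = 617   # estimated as 611 / 0.9899 from the log


def effective_batch_size(per_device: int, accum_steps: int, devices: int = 1) -> int:
    """Number of examples contributing to one optimizer step."""
    return per_device * accum_steps * devices


def cosine_lr(step: int, total_steps: int,
              peak_lr: float = LEARNING_RATE,
              warmup_ratio: float = WARMUP_RATIO) -> float:
    """Linear warmup to peak_lr, then half-cosine decay to zero."""
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    print("effective batch size:",
          effective_batch_size(TRAIN_BATCH_SIZE, GRADIENT_ACCUMULATION_STEPS, NUM_DEVICES))
    for step in (0, 13, 300, ESTIMATED_TOTAL_STEPS):
        print(f"step {step:>3}: lr = {cosine_lr(step, ESTIMATED_TOTAL_STEPS):.6f}")
```

With a warmup ratio of 0.02 and the estimated step count, warmup would last about 13 optimizer steps, which is also the logging interval seen in the training results.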

Training results

| Training Loss | Epoch  | Step | Validation Loss |
|:-------------:|:------:|:----:|:---------------:|
| 0.7562        | 0.0211 | 13   | 8.8561          |
| 8.5861        | 0.0421 | 26   | 6.7147          |
| 6.683         | 0.0632 | 39   | 6.4347          |
| 6.3623        | 0.0842 | 52   | 6.2959          |
| 6.1966        | 0.1053 | 65   | 6.1023          |
| 5.9253        | 0.1264 | 78   | 5.8562          |
| 5.6996        | 0.1474 | 91   | 5.7402          |
| 5.654         | 0.1685 | 104  | 5.5460          |
| 5.4346        | 0.1896 | 117  | 5.3902          |
| 5.2399        | 0.2106 | 130  | 5.1306          |
| 5.1411        | 0.2317 | 143  | 5.0223          |
| 5.0468        | 0.2527 | 156  | 4.9554          |
| 4.9675        | 0.2738 | 169  | 4.8488          |
| 4.8723        | 0.2949 | 182  | 4.9092          |
| 4.9509        | 0.3159 | 195  | 4.6985          |
| 4.7385        | 0.3370 | 208  | 4.7031          |
| 4.631         | 0.3580 | 221  | 4.6471          |
| 4.6294        | 0.3791 | 234  | 4.6124          |
| 4.5562        | 0.4002 | 247  | 4.5880          |
| 4.5684        | 0.4212 | 260  | 4.5116          |
| 4.5965        | 0.4423 | 273  | 4.5065          |
| 4.594         | 0.4633 | 286  | 4.4330          |
| 4.5223        | 0.4844 | 299  | 4.4393          |
| 4.4033        | 0.5055 | 312  | 4.4070          |
| 4.3706        | 0.5265 | 325  | 4.3485          |
| 4.3595        | 0.5476 | 338  | 4.3587          |
| 4.3865        | 0.5687 | 351  | 4.2940          |
| 4.342         | 0.5897 | 364  | 4.3082          |
| 4.2976        | 0.6108 | 377  | 4.2683          |
| 4.3627        | 0.6318 | 390  | 4.2331          |
| 4.2364        | 0.6529 | 403  | 4.2331          |
| 4.1543        | 0.6740 | 416  | 4.1827          |
| 4.2475        | 0.6950 | 429  | 4.2243          |
| 4.2247        | 0.7161 | 442  | 4.1690          |
| 4.1115        | 0.7371 | 455  | 4.1257          |
| 4.1388        | 0.7582 | 468  | 4.1157          |
| 4.0912        | 0.7793 | 481  | 4.1659          |
| 4.0903        | 0.8003 | 494  | 4.0926          |
| 4.1036        | 0.8214 | 507  | 4.0859          |
| 4.0692        | 0.8424 | 520  | 4.0732          |
| 4.0634        | 0.8635 | 533  | 4.0823          |
| 4.0463        | 0.8846 | 546  | 4.0597          |
| 4.0948        | 0.9056 | 559  | 4.0447          |
| 4.0496        | 0.9267 | 572  | 4.0293          |
| 3.9855        | 0.9478 | 585  | 4.0449          |
| 4.0289        | 0.9688 | 598  | 4.0360          |
| 4.0147        | 0.9899 | 611  | 4.0369          |
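To inspect a log like the one above programmatically, the following sketch parses rows of the form "training loss, epoch, step, validation loss" and reports the evaluation point with the lowest validation loss. The names `LogRow`, `parse_rows`, and `best_row` are hypothetical helpers, and the embedded text is only an excerpt (the last six rows of the table), not the full log.

```python
from typing import List, NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


# Excerpt: the last six rows of the training results table in this card.
EXCERPT = """
4.0463 0.8846 546 4.0597
4.0948 0.9056 559 4.0447
4.0496 0.9267 572 4.0293
3.9855 0.9478 585 4.0449
4.0289 0.9688 598 4.0360
4.0147 0.9899 611 4.0369
"""


def parse_rows(text: str) -> List[LogRow]:
    """Parse whitespace- or pipe-separated rows; skip anything non-numeric."""
    rows = []
    for line in text.splitlines():
        fields = line.replace("|", " ").split()
        if len(fields) != 4:
            continue
        try:
            rows.append(LogRow(float(fields[0]), float(fields[1]),
                               int(fields[2]), float(fields[3])))
        except ValueError:
            continue  # header or separator line
    return rows


def best_row(rows: List[LogRow]) -> LogRow:
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row.val_loss)


if __name__ == "__main__":
    best = best_row(parse_rows(EXCERPT))
    print(f"lowest validation loss {best.val_loss} at step {best.step}")
```

In the table, the lowest validation loss (4.0293) occurs at step 572, slightly below the final reported value of 4.0369 at step 611.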

Framework versions

  • PEFT 0.12.0
  • Transformers 4.44.0
  • Pytorch 2.4.0+cu121
  • Datasets 2.20.0
  • Tokenizers 0.19.1
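When trying to reproduce the run, it can help to compare a local environment against the versions listed above. The sketch below uses only the standard library. The mapping from the names in this card to PyPI distribution names (for example, Pytorch to `torch`) is an assumption, and `version_mismatches` is a hypothetical helper.

```python
from importlib import metadata
from typing import Callable, Dict, Optional, Tuple

# Versions listed in this card, keyed by assumed PyPI distribution name.
PINNED: Dict[str, str] = {
    "peft": "0.12.0",
    "transformers": "4.44.0",
    "torch": "2.4.0+cu121",
    "datasets": "2.20.0",
    "tokenizers": "0.19.1",
}


def installed_version(package: str) -> Optional[str]:
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def version_mismatches(
    pinned: Dict[str, str],
    lookup: Callable[[str], Optional[str]] = installed_version,
) -> Dict[str, Tuple[str, Optional[str]]]:
    """Map each differing package to (expected, found)."""
    mismatches = {}
    for package, expected in pinned.items():
        found = lookup(package)
        if found != expected:
            mismatches[package] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for package, (expected, found) in version_mismatches(PINNED).items():
        print(f"{package}: card lists {expected}, found {found or 'not installed'}")
```

An exact string match is deliberately strict; a different CUDA build of PyTorch, for instance, would be reported as a mismatch even if the base version agrees.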