# Mistral-7B-v0.1_cola_silu
This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unknown dataset (the `cola` suffix in the model name suggests the GLUE CoLA task). It achieves the following results on the evaluation set:
- Loss: 0.3753
- Accuracy: 0.8575
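A minimal loading sketch, assuming the checkpoint carries a sequence-classification head (the accuracy metric and the `cola` suffix point to single-sentence acceptability classification); the example sentence and the label mapping are assumptions, not from this card:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

repo = "thrunlab/Mistral-7B-v0.1_cola_silu"

tokenizer = AutoTokenizer.from_pretrained(repo)
# device_map="auto" requires the accelerate package
model = AutoModelForSequenceClassification.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)
model.eval()

inputs = tokenizer("The book was written by John.", return_tensors="pt").to(model.device)
with torch.no_grad():
    logits = model(**inputs).logits
# Assumed CoLA-style labels: 0 = unacceptable, 1 = acceptable
print(logits.argmax(dim=-1).item())
```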
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training (a hedged `TrainingArguments` sketch follows the list):
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 2
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 2
- total_train_batch_size: 256
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 750
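A minimal sketch of an equivalent Hugging Face `TrainingArguments` configuration, assuming the standard `Trainer` was used; the output directory and the logging/eval cadence are assumptions (the cadence is inferred from the 10-step spacing in the results table). With 2 GPUs and gradient accumulation of 2, the effective train batch size is 64 × 2 × 2 = 256, matching the reported total:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="Mistral-7B-v0.1_cola_silu",  # assumed, not from the card
    learning_rate=1e-5,
    per_device_train_batch_size=64,   # train_batch_size
    per_device_eval_batch_size=64,    # eval_batch_size
    gradient_accumulation_steps=2,    # 64 * 2 GPUs * 2 = 256 effective
    seed=2,
    max_steps=750,                    # training_steps
    lr_scheduler_type="linear",
    adam_beta1=0.9,                   # Adam betas=(0.9, 0.999)
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    evaluation_strategy="steps",      # eval every 10 steps per the table
    eval_steps=10,
    logging_steps=10,
)
```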
### Training results
Training Loss | Epoch | Step | Validation Loss | Accuracy |
---|---|---|---|---|
1.8023 | 0.33 | 10 | 1.7689 | 0.5465 |
1.3283 | 0.66 | 20 | 1.4353 | 0.5954 |
1.1049 | 0.98 | 30 | 1.1974 | 0.6328 |
0.8542 | 1.31 | 40 | 0.9991 | 0.6337 |
0.7607 | 1.64 | 50 | 0.7994 | 0.6711 |
0.661 | 1.97 | 60 | 0.6935 | 0.7354 |
0.5306 | 2.3 | 70 | 0.5793 | 0.7661 |
0.6456 | 2.62 | 80 | 0.5240 | 0.7900 |
0.5685 | 2.95 | 90 | 0.4889 | 0.7996 |
0.4556 | 3.28 | 100 | 0.4674 | 0.7967 |
0.6054 | 3.61 | 110 | 0.4582 | 0.8102 |
0.3784 | 3.93 | 120 | 0.4379 | 0.8121 |
0.3644 | 4.26 | 130 | 0.4198 | 0.8130 |
0.4346 | 4.59 | 140 | 0.4197 | 0.8178 |
0.4284 | 4.92 | 150 | 0.4075 | 0.8255 |
0.3597 | 5.25 | 160 | 0.3816 | 0.8380 |
0.3898 | 5.57 | 170 | 0.3905 | 0.8293 |
0.3815 | 5.9 | 180 | 0.3902 | 0.8351 |
0.307 | 6.23 | 190 | 0.3756 | 0.8418 |
0.3967 | 6.56 | 200 | 0.3706 | 0.8418 |
0.3673 | 6.89 | 210 | 0.3767 | 0.8428 |
0.3036 | 7.21 | 220 | 0.4043 | 0.8236 |
0.2788 | 7.54 | 230 | 0.3686 | 0.8447 |
0.3103 | 7.87 | 240 | 0.3909 | 0.8245 |
0.3109 | 8.2 | 250 | 0.3682 | 0.8408 |
0.3152 | 8.52 | 260 | 0.3627 | 0.8408 |
0.3514 | 8.85 | 270 | 0.3599 | 0.8456 |
0.2984 | 9.18 | 280 | 0.4051 | 0.8332 |
0.3046 | 9.51 | 290 | 0.3775 | 0.8370 |
0.2388 | 9.84 | 300 | 0.3745 | 0.8360 |
0.2641 | 10.16 | 310 | 0.3593 | 0.8408 |
0.2567 | 10.49 | 320 | 0.3537 | 0.8485 |
0.2414 | 10.82 | 330 | 0.3655 | 0.8466 |
0.2385 | 11.15 | 340 | 0.3656 | 0.8476 |
0.3334 | 11.48 | 350 | 0.3617 | 0.8504 |
0.2424 | 11.8 | 360 | 0.3520 | 0.8485 |
0.3105 | 12.13 | 370 | 0.3608 | 0.8447 |
0.2301 | 12.46 | 380 | 0.3510 | 0.8552 |
0.2869 | 12.79 | 390 | 0.3541 | 0.8447 |
0.2197 | 13.11 | 400 | 0.3518 | 0.8514 |
0.2533 | 13.44 | 410 | 0.3588 | 0.8466 |
0.2426 | 13.77 | 420 | 0.3545 | 0.8581 |
0.2097 | 14.1 | 430 | 0.3649 | 0.8456 |
0.1854 | 14.43 | 440 | 0.3858 | 0.8504 |
0.1908 | 14.75 | 450 | 0.3636 | 0.8466 |
0.1651 | 15.08 | 460 | 0.3692 | 0.8504 |
0.1954 | 15.41 | 470 | 0.3866 | 0.8514 |
0.1904 | 15.74 | 480 | 0.3763 | 0.8514 |
0.1437 | 16.07 | 490 | 0.3838 | 0.8514 |
0.1535 | 16.39 | 500 | 0.3889 | 0.8485 |
0.1086 | 16.72 | 510 | 0.3960 | 0.8514 |
0.1327 | 17.05 | 520 | 0.4210 | 0.8447 |
0.1104 | 17.38 | 530 | 0.4173 | 0.8408 |
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.1+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0
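To reproduce this environment, a pinned install along these lines should work; the CUDA 12.1 wheel index for PyTorch is an assumption based on the `+cu121` build tag:

```
pip install transformers==4.35.2 datasets==2.15.0 tokenizers==0.15.0
pip install torch==2.1.1 --index-url https://download.pytorch.org/whl/cu121
```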