# Mistral-7B-v0.1_cola_rsb_relu
This model is a fine-tuned version of mistralai/Mistral-7B-v0.1 on an unknown dataset. It achieves the following results on the evaluation set:
- Loss: 1.3524
- Accuracy: 0.6577
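The reported accuracy is the standard classification metric: the fraction of evaluation examples whose predicted label matches the reference label. A minimal sketch (the prediction and label values below are made up for illustration, not taken from this model's evaluation run):

```python
def accuracy(preds, labels):
    """Fraction of predictions that exactly match the reference labels."""
    correct = sum(p == l for p, l in zip(preds, labels))
    return correct / len(labels)

# Toy example: 3 of 4 predictions match the references.
print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))  # 0.75
```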
## Model description
More information needed
## Intended uses & limitations
More information needed
## Training and evaluation data
More information needed
## Training procedure

### Training hyperparameters
The following hyperparameters were used during training:
- learning_rate: 1e-05
- train_batch_size: 64
- eval_batch_size: 64
- seed: 2
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 2
- total_train_batch_size: 256
- total_eval_batch_size: 128
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- training_steps: 750
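The listed total batch size follows from the per-device batch size, the number of devices, and gradient accumulation, and the linear scheduler decays the learning rate toward zero over the training steps. A minimal sketch of both relationships (function names are illustrative, and the schedule assumes zero warmup steps, which the card does not specify):

```python
def effective_batch_size(per_device, num_devices, grad_accum):
    """Per-device batch size x number of devices x gradient accumulation steps."""
    return per_device * num_devices * grad_accum

def linear_lr(step, base_lr=1e-5, total_steps=750):
    """Linear decay from base_lr to 0 over total_steps (no warmup assumed)."""
    return base_lr * max(0.0, 1.0 - step / total_steps)

# 64 per device x 2 GPUs x 2 accumulation steps = 256 (total_train_batch_size).
print(effective_batch_size(64, 2, 2))  # 256
print(linear_lr(0))                    # 1e-05 at the first step
print(linear_lr(750))                  # 0.0 at the final step
```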
### Training results
| Training Loss | Epoch | Step | Validation Loss | Accuracy |
|:-------------|:------|:-----|:----------------|:---------|
2.5204 | 0.33 | 10 | 2.7397 | 0.5503 |
2.415 | 0.66 | 20 | 2.5289 | 0.5475 |
2.208 | 0.98 | 30 | 2.2658 | 0.5417 |
1.4886 | 1.31 | 40 | 1.9140 | 0.5580 |
1.0463 | 1.64 | 50 | 1.5899 | 0.5436 |
0.9207 | 1.97 | 60 | 1.1666 | 0.5916 |
0.6876 | 2.3 | 70 | 0.9105 | 0.6290 |
0.7447 | 2.62 | 80 | 0.7802 | 0.6385 |
0.664 | 2.95 | 90 | 0.7189 | 0.6270 |
0.6109 | 3.28 | 100 | 0.6891 | 0.6539 |
0.5817 | 3.61 | 110 | 0.6786 | 0.6625 |
0.6261 | 3.93 | 120 | 0.6674 | 0.6280 |
0.5419 | 4.26 | 130 | 0.6621 | 0.6577 |
0.5489 | 4.59 | 140 | 0.6757 | 0.6625 |
0.4903 | 4.92 | 150 | 0.6603 | 0.6596 |
0.495 | 5.25 | 160 | 0.6607 | 0.6520 |
0.5505 | 5.57 | 170 | 0.6742 | 0.6309 |
0.5342 | 5.9 | 180 | 0.6733 | 0.6558 |
0.4534 | 6.23 | 190 | 0.6735 | 0.6539 |
0.4728 | 6.56 | 200 | 0.6843 | 0.6577 |
0.4911 | 6.89 | 210 | 0.7004 | 0.6510 |
0.399 | 7.21 | 220 | 0.7090 | 0.6644 |
0.4304 | 7.54 | 230 | 0.7368 | 0.6750 |
0.4325 | 7.87 | 240 | 0.7183 | 0.6616 |
0.3899 | 8.2 | 250 | 0.7379 | 0.6568 |
0.3659 | 8.52 | 260 | 0.7466 | 0.6644 |
0.4117 | 8.85 | 270 | 0.7640 | 0.6529 |
0.3277 | 9.18 | 280 | 0.7780 | 0.6242 |
0.3311 | 9.51 | 290 | 0.7982 | 0.6817 |
0.337 | 9.84 | 300 | 0.8088 | 0.6625 |
0.297 | 10.16 | 310 | 0.8086 | 0.6500 |
0.2856 | 10.49 | 320 | 0.8621 | 0.6644 |
0.2939 | 10.82 | 330 | 0.8666 | 0.6357 |
0.2517 | 11.15 | 340 | 0.9056 | 0.6769 |
0.2726 | 11.48 | 350 | 0.8940 | 0.6318 |
0.2985 | 11.8 | 360 | 0.9314 | 0.6529 |
0.1915 | 12.13 | 370 | 0.9358 | 0.6731 |
0.2081 | 12.46 | 380 | 0.9824 | 0.6405 |
0.2147 | 12.79 | 390 | 1.0395 | 0.6577 |
0.1644 | 13.11 | 400 | 1.0678 | 0.6606 |
0.154 | 13.44 | 410 | 1.1975 | 0.6462 |
0.1734 | 13.77 | 420 | 1.2171 | 0.6510 |
0.1008 | 14.1 | 430 | 1.2565 | 0.6596 |
0.1104 | 14.43 | 440 | 1.3389 | 0.6318 |
0.1125 | 14.75 | 450 | 1.3177 | 0.6500 |
0.067 | 15.08 | 460 | 1.4552 | 0.6520 |
0.0897 | 15.41 | 470 | 1.4729 | 0.6548 |
0.0945 | 15.74 | 480 | 1.4543 | 0.6500 |
0.0602 | 16.07 | 490 | 1.6014 | 0.6491 |
0.0525 | 16.39 | 500 | 1.6417 | 0.6337 |
0.0701 | 16.72 | 510 | 1.5756 | 0.6414 |
0.0413 | 17.05 | 520 | 1.6458 | 0.6491 |
0.0885 | 17.38 | 530 | 1.6182 | 0.6443 |
0.0318 | 17.7 | 540 | 1.6987 | 0.6405 |
0.0498 | 18.03 | 550 | 1.6695 | 0.6596 |
0.0531 | 18.36 | 560 | 1.7738 | 0.6347 |
### Framework versions
- Transformers 4.35.2
- Pytorch 2.1.1+cu121
- Datasets 2.15.0
- Tokenizers 0.15.0
## Model tree for thrunlab/Mistral-7B-v0.1_cola_rsb_relu

Base model: mistralai/Mistral-7B-v0.1