do test run on scitas with ref_model
- README.md +20 -20
- model.safetensors +1 -1
README.md
CHANGED
@@ -17,15 +17,15 @@ should probably proofread and complete it, then remove this comment. -->

 This model is a fine-tuned version of [mNLP-project/gpt2-finetuned](https://huggingface.co/mNLP-project/gpt2-finetuned) on the None dataset.
 It achieves the following results on the evaluation set:
-- Loss: 1.
-- Rewards/chosen:
-- Rewards/rejected:
-- Rewards/accuracies: 0.
-- Rewards/margins:
-- Logps/rejected: -
-- Logps/chosen: -
-- Logits/rejected: -
-- Logits/chosen: -

 ## Model description

@@ -44,7 +44,7 @@ More information needed
 ### Training hyperparameters

 The following hyperparameters were used during training:
-- learning_rate:
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
@@ -59,16 +59,16 @@ The following hyperparameters were used during training:

 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:-----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
-|
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.
-| 0.


 ### Framework versions


 This model is a fine-tuned version of [mNLP-project/gpt2-finetuned](https://huggingface.co/mNLP-project/gpt2-finetuned) on the None dataset.
 It achieves the following results on the evaluation set:
+- Loss: 1.1168
+- Rewards/chosen: 3.8849
+- Rewards/rejected: 3.2031
+- Rewards/accuracies: 0.5892
+- Rewards/margins: 0.6818
+- Logps/rejected: -761.2470
+- Logps/chosen: -910.5992
+- Logits/rejected: -36.5651
+- Logits/chosen: -30.3810

 ## Model description

 ### Training hyperparameters

 The following hyperparameters were used during training:
+- learning_rate: 1e-05
 - train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42


 | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
 |:-------------:|:-----:|:-----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.9846 | 1.0 | 1337 | 1.1168 | 3.8849 | 3.2031 | 0.5892 | 0.6818 | -761.2470 | -910.5992 | -36.5651 | -30.3810 |
+| 0.6025 | 2.0 | 2674 | 1.1405 | 5.0060 | 4.0992 | 0.6175 | 0.9068 | -752.2864 | -899.3887 | -35.0528 | -28.9839 |
+| 0.2464 | 3.0 | 4011 | 1.1202 | 4.6754 | 3.6835 | 0.6160 | 0.9919 | -756.4427 | -902.6943 | -39.6513 | -33.3219 |
+| 0.1182 | 4.0 | 5348 | 1.3054 | 7.3114 | 5.8367 | 0.6131 | 1.4747 | -734.9108 | -876.3349 | -35.1974 | -28.6005 |
+| 0.0669 | 5.0 | 6685 | 1.3846 | 6.5378 | 5.0738 | 0.6093 | 1.4640 | -742.5399 | -884.0710 | -39.0355 | -31.8814 |
+| 0.0226 | 6.0 | 8022 | 1.4662 | 6.2901 | 4.6812 | 0.6052 | 1.6089 | -746.4659 | -886.5475 | -40.3811 | -32.9593 |
+| 0.0128 | 7.0 | 9359 | 1.5557 | 5.8081 | 4.1554 | 0.6108 | 1.6527 | -751.7241 | -891.3676 | -39.1744 | -31.2704 |
+| 0.019 | 8.0 | 10696 | 1.6676 | 5.5428 | 3.8458 | 0.6011 | 1.6970 | -754.8205 | -894.0207 | -40.5161 | -32.4700 |
+| 0.0101 | 9.0 | 12033 | 1.7100 | 5.5531 | 3.8215 | 0.6022 | 1.7315 | -755.0627 | -893.9178 | -40.7171 | -32.5929 |
+| 0.0053 | 10.0 | 13370 | 1.7177 | 5.4221 | 3.7030 | 0.6000 | 1.7191 | -756.2481 | -895.2274 | -40.8064 | -32.6689 |


 ### Framework versions

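The added training-results rows can be sanity-checked numerically. The following is a minimal sketch, assuming that Rewards/margins is reported as Rewards/chosen minus Rewards/rejected (the usual convention in preference-optimization trainers; the diff itself does not name the trainer). `ROWS`, `margin_mismatches`, and `best_epoch` are hypothetical names introduced for illustration and are not part of this repository.

```python
# Values copied from the added table rows:
# (epoch, validation_loss, rewards_chosen, rewards_rejected, rewards_margins)
ROWS = [
    (1, 1.1168, 3.8849, 3.2031, 0.6818),
    (2, 1.1405, 5.0060, 4.0992, 0.9068),
    (3, 1.1202, 4.6754, 3.6835, 0.9919),
    (4, 1.3054, 7.3114, 5.8367, 1.4747),
    (5, 1.3846, 6.5378, 5.0738, 1.4640),
    (6, 1.4662, 6.2901, 4.6812, 1.6089),
    (7, 1.5557, 5.8081, 4.1554, 1.6527),
    (8, 1.6676, 5.5428, 3.8458, 1.6970),
    (9, 1.7100, 5.5531, 3.8215, 1.7315),
    (10, 1.7177, 5.4221, 3.7030, 1.7191),
]


def margin_mismatches(rows, tol=2e-4):
    """Return the epochs where margin != chosen - rejected beyond rounding.

    Assumption: margins are computed as chosen minus rejected. The tolerance
    allows for the four-decimal rounding used in the table.
    """
    return [
        epoch
        for epoch, _, chosen, rejected, margin in rows
        if abs((chosen - rejected) - margin) > tol
    ]


def best_epoch(rows):
    """Return the epoch with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])[0]


if __name__ == "__main__":
    print("margin mismatches:", margin_mismatches(ROWS))
    print("lowest validation loss at epoch:", best_epoch(ROWS))
```

On these values every row is consistent with the assumed definition, and the lowest validation loss occurs at epoch 1, which matches the evaluation summary (Loss: 1.1168). Training loss keeps falling to 0.0053 while validation loss rises to 1.7177, a pattern consistent with overfitting after the first epoch.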
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
 size 497774208

 version https://git-lfs.github.com/spec/v1
+oid sha256:dfca1a44eee10523ba16f368b4b7c634d3a5869375730b11248d79f343712a2d
 size 497774208
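The `model.safetensors` entry is a Git LFS pointer: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. As a rough sketch of how a downloaded weights file could be checked against this pointer, the helpers below parse the pointer text and compare size and digest. `parse_lfs_pointer` and `matches_pointer` are hypothetical names, and the local file path passed to `matches_pointer` is an assumption; nothing here is part of the Git LFS tooling itself.

```python
import hashlib
from pathlib import Path

# Pointer text as it appears on the new side of the diff.
POINTER_TEXT = """version https://git-lfs.github.com/spec/v1
oid sha256:dfca1a44eee10523ba16f368b4b7c634d3a5869375730b11248d79f343712a2d
size 497774208
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer into its version, hash algorithm, digest, and size."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named by the pointer."""
    path = Path(path)
    if path.stat().st_size != pointer["size"]:
        return False  # cheap check first; avoids hashing a file of the wrong size
    hasher = hashlib.new(pointer["algo"])
    with path.open("rb") as handle:
        # Read in chunks so a ~500 MB weights file is not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest() == pointer["digest"]


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER_TEXT)
    print(pointer["algo"], pointer["size"])
```

Note that the size is unchanged between the two sides of the diff (497774208 bytes), so only the object ID distinguishes the new weights from the old ones.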