Luca-Engel committed on
Commit
505e9ee
·
verified ·
1 Parent(s): 192c42c

do test run on scitas with ref_model

Files changed (2)
  1. README.md +20 -20
  2. model.safetensors +1 -1
README.md CHANGED
@@ -17,15 +17,15 @@ should probably proofread and complete it, then remove this comment. -->
 
  This model is a fine-tuned version of [mNLP-project/gpt2-finetuned](https://huggingface.co/mNLP-project/gpt2-finetuned) on the None dataset.
  It achieves the following results on the evaluation set:
- - Loss: 1.5027
- - Rewards/chosen: 7.0965
- - Rewards/rejected: 5.7124
- - Rewards/accuracies: 0.6101
- - Rewards/margins: 1.3842
- - Logps/rejected: -736.1544
- - Logps/chosen: -878.4832
- - Logits/rejected: -37.8324
- - Logits/chosen: -32.9004
 
  ## Model description
 
@@ -44,7 +44,7 @@ More information needed
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
- - learning_rate: 3e-05
  - train_batch_size: 8
  - eval_batch_size: 8
  - seed: 42
@@ -59,16 +59,16 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-----:|:-----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
- | 1.2678 | 1.0 | 1337 | 1.5076 | 4.1438 | 3.2671 | 0.5687 | 0.8767 | -760.6065 | -908.0106 | -41.6142 | -35.6298 |
- | 0.8767 | 2.0 | 2674 | 1.5027 | 7.0965 | 5.7124 | 0.6101 | 1.3842 | -736.1544 | -878.4832 | -37.8324 | -32.9004 |
- | 0.431 | 3.0 | 4011 | 1.5905 | 6.0978 | 4.7517 | 0.5929 | 1.3462 | -745.7613 | -888.4703 | -38.2186 | -32.5863 |
- | 0.1242 | 4.0 | 5348 | 1.7672 | 8.8069 | 6.9080 | 0.6138 | 1.8988 | -724.1977 | -861.3801 | -35.3133 | -29.3914 |
- | 0.0166 | 5.0 | 6685 | 1.9424 | 8.8192 | 6.7038 | 0.6011 | 2.1154 | -726.2397 | -861.2565 | -39.2436 | -33.0524 |
- | 0.0031 | 6.0 | 8022 | 2.0099 | 7.4468 | 5.2575 | 0.6071 | 2.1894 | -740.7034 | -874.9804 | -39.1570 | -32.4214 |
- | 0.0111 | 7.0 | 9359 | 2.0798 | 6.9472 | 4.8187 | 0.6004 | 2.1285 | -745.0905 | -879.9766 | -39.8656 | -32.8297 |
- | 0.0176 | 8.0 | 10696 | 2.1751 | 6.9736 | 4.7371 | 0.6034 | 2.2364 | -745.9068 | -879.7130 | -39.2893 | -32.1535 |
- | 0.0089 | 9.0 | 12033 | 2.2161 | 6.6595 | 4.4256 | 0.6019 | 2.2339 | -749.0217 | -882.8531 | -39.3982 | -32.1973 |
- | 0.0045 | 10.0 | 13370 | 2.2229 | 6.5755 | 4.3480 | 0.6007 | 2.2275 | -749.7980 | -883.6937 | -39.5822 | -32.3730 |
 
 
  ### Framework versions
 
 
  This model is a fine-tuned version of [mNLP-project/gpt2-finetuned](https://huggingface.co/mNLP-project/gpt2-finetuned) on the None dataset.
  It achieves the following results on the evaluation set:
+ - Loss: 1.1168
+ - Rewards/chosen: 3.8849
+ - Rewards/rejected: 3.2031
+ - Rewards/accuracies: 0.5892
+ - Rewards/margins: 0.6818
+ - Logps/rejected: -761.2470
+ - Logps/chosen: -910.5992
+ - Logits/rejected: -36.5651
+ - Logits/chosen: -30.3810
 
  ## Model description
 
 
  ### Training hyperparameters
 
  The following hyperparameters were used during training:
+ - learning_rate: 1e-05
  - train_batch_size: 8
  - eval_batch_size: 8
  - seed: 42
 
 
  | Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
  |:-------------:|:-----:|:-----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 0.9846 | 1.0 | 1337 | 1.1168 | 3.8849 | 3.2031 | 0.5892 | 0.6818 | -761.2470 | -910.5992 | -36.5651 | -30.3810 |
+ | 0.6025 | 2.0 | 2674 | 1.1405 | 5.0060 | 4.0992 | 0.6175 | 0.9068 | -752.2864 | -899.3887 | -35.0528 | -28.9839 |
+ | 0.2464 | 3.0 | 4011 | 1.1202 | 4.6754 | 3.6835 | 0.6160 | 0.9919 | -756.4427 | -902.6943 | -39.6513 | -33.3219 |
+ | 0.1182 | 4.0 | 5348 | 1.3054 | 7.3114 | 5.8367 | 0.6131 | 1.4747 | -734.9108 | -876.3349 | -35.1974 | -28.6005 |
+ | 0.0669 | 5.0 | 6685 | 1.3846 | 6.5378 | 5.0738 | 0.6093 | 1.4640 | -742.5399 | -884.0710 | -39.0355 | -31.8814 |
+ | 0.0226 | 6.0 | 8022 | 1.4662 | 6.2901 | 4.6812 | 0.6052 | 1.6089 | -746.4659 | -886.5475 | -40.3811 | -32.9593 |
+ | 0.0128 | 7.0 | 9359 | 1.5557 | 5.8081 | 4.1554 | 0.6108 | 1.6527 | -751.7241 | -891.3676 | -39.1744 | -31.2704 |
+ | 0.019 | 8.0 | 10696 | 1.6676 | 5.5428 | 3.8458 | 0.6011 | 1.6970 | -754.8205 | -894.0207 | -40.5161 | -32.4700 |
+ | 0.0101 | 9.0 | 12033 | 1.7100 | 5.5531 | 3.8215 | 0.6022 | 1.7315 | -755.0627 | -893.9178 | -40.7171 | -32.5929 |
+ | 0.0053 | 10.0 | 13370 | 1.7177 | 5.4221 | 3.7030 | 0.6000 | 1.7191 | -756.2481 | -895.2274 | -40.8064 | -32.6689 |
 
 
  ### Framework versions
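
Review note on the README change: the summary block in the updated card appears to report the epoch with the lowest validation loss (1.1168 at epoch 1), and each Rewards/margins value looks like Rewards/chosen minus Rewards/rejected up to rounding. The sketch below checks both observations against rows copied from the new table. The interpretation of the margin as "chosen minus rejected" is an assumption based on the metric names, and the helper names (`best_epoch`, `margin_mismatches`) are hypothetical, not part of any training library.

```python
# Rows copied from the updated README table:
# (epoch, validation_loss, rewards_chosen, rewards_rejected, rewards_margin)
NEW_RUN = [
    (1, 1.1168, 3.8849, 3.2031, 0.6818),
    (2, 1.1405, 5.0060, 4.0992, 0.9068),
    (3, 1.1202, 4.6754, 3.6835, 0.9919),
    (4, 1.3054, 7.3114, 5.8367, 1.4747),
    (5, 1.3846, 6.5378, 5.0738, 1.4640),
    (6, 1.4662, 6.2901, 4.6812, 1.6089),
    (7, 1.5557, 5.8081, 4.1554, 1.6527),
    (8, 1.6676, 5.5428, 3.8458, 1.6970),
    (9, 1.7100, 5.5531, 3.8215, 1.7315),
    (10, 1.7177, 5.4221, 3.7030, 1.7191),
]


def best_epoch(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda row: row[1])


def margin_mismatches(rows, tol=2e-4):
    """Return epochs where chosen - rejected differs from the reported margin.

    The tolerance absorbs the four-decimal rounding used in the table
    (assumption: the margin is the difference of the two reward columns).
    """
    return [
        epoch
        for epoch, _, chosen, rejected, margin in rows
        if abs((chosen - rejected) - margin) > tol
    ]


if __name__ == "__main__":
    epoch, val_loss, *_ = best_epoch(NEW_RUN)
    print(f"lowest validation loss {val_loss} at epoch {epoch}")
    print("margin mismatches:", margin_mismatches(NEW_RUN))
```

On these rows the lowest validation loss falls at epoch 1, matching the summary block, and no epoch exceeds the rounding tolerance. Validation loss rises from epoch 4 onward while training loss approaches zero, so later checkpoints of this run are likely overfit.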
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:26ab6777efa1486fbedaa67fd9f8ffc5db2b450dd61768a689397ed08eafa178
  size 497774208
 
  version https://git-lfs.github.com/spec/v1
+ oid sha256:dfca1a44eee10523ba16f368b4b7c634d3a5869375730b11248d79f343712a2d
  size 497774208
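
Review note on the model.safetensors change: only the `oid` line of the Git LFS pointer differs, and the size is unchanged, which is consistent with retrained weights of the same shape. A minimal sketch for checking a downloaded file against the new pointer follows. The function names (`parse_lfs_pointer`, `matches_pointer`) are hypothetical helpers, and the local path in the usage note is an assumption.

```python
import hashlib

# New pointer contents, copied from the diff above.
NEW_POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:dfca1a44eee10523ba16f368b4b7c634d3a5869375730b11248d79f343712a2d
size 497774208
"""


def parse_lfs_pointer(text):
    """Parse a Git LFS pointer file into its version, hash algorithm, digest and size."""
    fields = dict(
        line.strip().split(" ", 1) for line in text.splitlines() if line.strip()
    )
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


def matches_pointer(path, pointer, chunk_size=1 << 20):
    """Return True if the file at `path` has the size and digest named by the pointer."""
    hasher = hashlib.new(pointer["algo"])
    size = 0
    with open(path, "rb") as handle:
        # Read in chunks so a ~500 MB weights file is never held in memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
            size += len(chunk)
    return size == pointer["size"] and hasher.hexdigest() == pointer["digest"]
```

Usage, assuming the weights were downloaded to `./model.safetensors`: `matches_pointer("model.safetensors", parse_lfs_pointer(NEW_POINTER))` returns `True` only when both the byte count and the SHA-256 digest agree with the pointer.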