Model save

Files changed:
- README.md (+114 −0)
- adapter_model.safetensors (+1 −1)
- all_results.json (+8 −0)
- runs/Jul22_23-14-44_node26/events.out.tfevents.1721657973.node26.3249241.0 (+2 −2)
- train_results.json (+8 −0)
- trainer_state.json (+0 −0)

README.md (ADDED)
---
license: apache-2.0
library_name: peft
tags:
- trl
- dpo
- generated_from_trainer
base_model: alignment-handbook/zephyr-7b-sft-full
model-index:
- name: zephyr-7b-dpo-uffull-qlora-5e-7
  results: []
---
13 |
+
|
14 |
+
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
15 |
+
should probably proofread and complete it, then remove this comment. -->
|
16 |
+
|
17 |
+
# zephyr-7b-dpo-uffull-qlora-5e-7
|
18 |
+
|
19 |
+
This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the None dataset.
|
20 |
+
It achieves the following results on the evaluation set:
|
21 |
+
- Loss: 0.5925
|
22 |
+
- Rewards/chosen: -0.2517
|
23 |
+
- Rewards/rejected: -0.6019
|
24 |
+
- Rewards/accuracies: 0.7341
|
25 |
+
- Rewards/margins: 0.3502
|
26 |
+
- Rewards/margins Max: 1.2316
|
27 |
+
- Rewards/margins Min: -0.5546
|
28 |
+
- Rewards/margins Std: 0.6038
|
29 |
+
- Logps/rejected: -322.3433
|
30 |
+
- Logps/chosen: -309.6684
|
31 |
+
- Logits/rejected: -2.6801
|
32 |
+
- Logits/chosen: -2.7126
|
33 |
+
|
34 |
+
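The reward metrics above are DPO's *implicit* rewards: β times the gap between the policy's and the reference model's sequence log-probability. A minimal sketch of how they relate (the reference-model log-probs below are hypothetical values chosen purely for illustration, and β = 0.1 is a common TRL default, not a value confirmed by this card):

```python
def implicit_reward(policy_logp, ref_logp, beta=0.1):
    """DPO implicit reward: beta * (log pi(y|x) - log pi_ref(y|x))."""
    return beta * (policy_logp - ref_logp)

# Logps/chosen and Logps/rejected come from the eval results above; the
# reference log-probs here are hypothetical, picked only for illustration.
chosen = implicit_reward(policy_logp=-309.67, ref_logp=-307.15)
rejected = implicit_reward(policy_logp=-322.34, ref_logp=-316.32)
margin = chosen - rejected  # Rewards/margins is the mean of this quantity
```

Rewards/accuracies is then the fraction of preference pairs for which the chosen completion's implicit reward exceeds the rejected one's.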
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 16
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
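The cosine schedule with 10% linear warmup implied by these hyperparameters can be sketched as follows. This is a simplified reimplementation of the multiplier `transformers` applies, not the library code itself; the total step count is an assumption derived from 61,134 training samples at the effective batch size of 16 (4 per device × 4 GPUs):

```python
import math

def lr_at_step(step, total_steps, base_lr=5e-7, warmup_ratio=0.1):
    """Cosine decay with linear warmup (a sketch of the schedule used)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first 10% of steps.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to ~0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 3821  # assumption: ceil(61134 / 16) optimizer steps for one epoch
peak = lr_at_step(382, total)     # end of warmup -> peak learning rate
final = lr_at_step(total, total)  # end of training -> decayed to ~0
```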
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:-------------------:|:-------------------:|:-------------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6929 | 0.03 | 100 | 0.6930 | 0.0001 | -0.0003 | 0.5377 | 0.0004 | 0.0054 | -0.0041 | 0.0032 | -262.1841 | -284.4886 | -2.7819 | -2.8200 |
| 0.6922 | 0.05 | 200 | 0.6923 | 0.0008 | -0.0010 | 0.6627 | 0.0019 | 0.0100 | -0.0058 | 0.0051 | -262.2543 | -284.4120 | -2.7814 | -2.8195 |
| 0.6908 | 0.08 | 300 | 0.6903 | 0.0041 | -0.0025 | 0.7143 | 0.0066 | 0.0281 | -0.0141 | 0.0137 | -262.3995 | -284.0884 | -2.7806 | -2.8185 |
| 0.689 | 0.1 | 400 | 0.6870 | 0.0093 | -0.0046 | 0.7183 | 0.0140 | 0.0586 | -0.0282 | 0.0285 | -262.6125 | -283.5621 | -2.7783 | -2.8162 |
| 0.6813 | 0.13 | 500 | 0.6813 | 0.0235 | -0.0040 | 0.7242 | 0.0275 | 0.1137 | -0.0534 | 0.0551 | -262.5450 | -282.1426 | -2.7758 | -2.8132 |
| 0.6712 | 0.16 | 600 | 0.6742 | 0.0200 | -0.0247 | 0.7262 | 0.0447 | 0.1814 | -0.0859 | 0.0884 | -264.6151 | -282.4901 | -2.7638 | -2.8015 |
| 0.6643 | 0.18 | 700 | 0.6653 | 0.0004 | -0.0668 | 0.7242 | 0.0672 | 0.2707 | -0.1305 | 0.1329 | -268.8295 | -284.4591 | -2.7558 | -2.7925 |
| 0.6421 | 0.21 | 800 | 0.6562 | -0.0231 | -0.1154 | 0.7222 | 0.0923 | 0.3706 | -0.1761 | 0.1820 | -273.6847 | -286.8017 | -2.7519 | -2.7880 |
| 0.648 | 0.24 | 900 | 0.6480 | -0.0748 | -0.1938 | 0.7183 | 0.1190 | 0.4823 | -0.2242 | 0.2359 | -281.5314 | -291.9791 | -2.7477 | -2.7835 |
| 0.6547 | 0.26 | 1000 | 0.6378 | -0.0763 | -0.2278 | 0.7183 | 0.1515 | 0.5995 | -0.2816 | 0.2954 | -284.9341 | -292.1262 | -2.7446 | -2.7798 |
| 0.6408 | 0.29 | 1100 | 0.6317 | -0.0432 | -0.2136 | 0.7262 | 0.1704 | 0.6414 | -0.2953 | 0.3163 | -283.5132 | -288.8173 | -2.7545 | -2.7885 |
| 0.6358 | 0.31 | 1200 | 0.6260 | -0.0529 | -0.2480 | 0.7183 | 0.1952 | 0.7219 | -0.3249 | 0.3520 | -286.9514 | -289.7809 | -2.7585 | -2.7914 |
| 0.6297 | 0.34 | 1300 | 0.6215 | -0.1213 | -0.3378 | 0.7143 | 0.2165 | 0.8114 | -0.3727 | 0.4028 | -295.9312 | -296.6275 | -2.7489 | -2.7816 |
| 0.6165 | 0.37 | 1400 | 0.6213 | -0.2177 | -0.4420 | 0.7103 | 0.2243 | 0.8626 | -0.4022 | 0.4264 | -306.3474 | -306.2648 | -2.7404 | -2.7733 |
| 0.6185 | 0.39 | 1500 | 0.6162 | -0.1021 | -0.3356 | 0.7063 | 0.2335 | 0.8779 | -0.3976 | 0.4349 | -295.7101 | -294.7082 | -2.7425 | -2.7745 |
| 0.6066 | 0.42 | 1600 | 0.6141 | -0.1696 | -0.4256 | 0.7123 | 0.2560 | 0.9394 | -0.4398 | 0.4678 | -304.7078 | -301.4554 | -2.7367 | -2.7689 |
| 0.6048 | 0.44 | 1700 | 0.6123 | -0.1220 | -0.3748 | 0.7123 | 0.2529 | 0.9411 | -0.4235 | 0.4656 | -299.6321 | -296.6920 | -2.7315 | -2.7638 |
| 0.609 | 0.47 | 1800 | 0.6090 | -0.1424 | -0.4122 | 0.7282 | 0.2698 | 0.9829 | -0.4478 | 0.4813 | -303.3703 | -298.7344 | -2.7251 | -2.7574 |
| 0.5909 | 0.5 | 1900 | 0.6062 | -0.2373 | -0.5239 | 0.7183 | 0.2866 | 1.0475 | -0.4860 | 0.5181 | -314.5422 | -308.2264 | -2.7186 | -2.7507 |
| 0.6011 | 0.52 | 2000 | 0.6048 | -0.1288 | -0.4109 | 0.7242 | 0.2821 | 1.0037 | -0.4627 | 0.4932 | -303.2409 | -297.3789 | -2.7100 | -2.7425 |
| 0.6047 | 0.55 | 2100 | 0.6031 | -0.1486 | -0.4420 | 0.7262 | 0.2934 | 1.0559 | -0.4792 | 0.5193 | -306.3505 | -299.3512 | -2.7123 | -2.7448 |
| 0.592 | 0.58 | 2200 | 0.6011 | -0.2623 | -0.5777 | 0.7242 | 0.3154 | 1.1326 | -0.5284 | 0.5638 | -319.9217 | -310.7270 | -2.7100 | -2.7423 |
| 0.6285 | 0.6 | 2300 | 0.6022 | -0.3099 | -0.6207 | 0.7242 | 0.3108 | 1.1254 | -0.5181 | 0.5570 | -324.2166 | -315.4819 | -2.7044 | -2.7370 |
| 0.6258 | 0.63 | 2400 | 0.6005 | -0.1642 | -0.4737 | 0.7302 | 0.3095 | 1.0716 | -0.4957 | 0.5259 | -309.5165 | -300.9170 | -2.6960 | -2.7291 |
| 0.5855 | 0.65 | 2500 | 0.5981 | -0.2145 | -0.5381 | 0.7341 | 0.3237 | 1.1337 | -0.5235 | 0.5568 | -315.9617 | -305.9418 | -2.6924 | -2.7253 |
| 0.6095 | 0.68 | 2600 | 0.5970 | -0.2416 | -0.5724 | 0.7262 | 0.3308 | 1.1753 | -0.5364 | 0.5756 | -319.3885 | -308.6579 | -2.6859 | -2.7187 |
| 0.6013 | 0.71 | 2700 | 0.5961 | -0.2450 | -0.5789 | 0.7262 | 0.3340 | 1.1924 | -0.5460 | 0.5830 | -320.0433 | -308.9903 | -2.6845 | -2.7170 |
| 0.6233 | 0.73 | 2800 | 0.5954 | -0.2426 | -0.5787 | 0.7302 | 0.3361 | 1.2015 | -0.5491 | 0.5882 | -320.0177 | -308.7550 | -2.6852 | -2.7174 |
| 0.6119 | 0.76 | 2900 | 0.5944 | -0.2613 | -0.6032 | 0.7282 | 0.3419 | 1.2206 | -0.5595 | 0.6006 | -322.4701 | -310.6289 | -2.6853 | -2.7176 |
| 0.5644 | 0.79 | 3000 | 0.5938 | -0.2218 | -0.5648 | 0.7282 | 0.3430 | 1.1989 | -0.5312 | 0.5872 | -318.6263 | -306.6716 | -2.6826 | -2.7150 |
| 0.5946 | 0.81 | 3100 | 0.5932 | -0.2763 | -0.6239 | 0.7262 | 0.3476 | 1.2359 | -0.5639 | 0.6094 | -324.5376 | -312.1256 | -2.6762 | -2.7090 |
| 0.5961 | 0.84 | 3200 | 0.5930 | -0.2713 | -0.6200 | 0.7262 | 0.3487 | 1.2365 | -0.5595 | 0.6090 | -324.1454 | -311.6203 | -2.6815 | -2.7140 |
| 0.5841 | 0.86 | 3300 | 0.5927 | -0.2686 | -0.6177 | 0.7302 | 0.3491 | 1.2362 | -0.5602 | 0.6093 | -323.9175 | -311.3521 | -2.6834 | -2.7157 |
| 0.611 | 0.89 | 3400 | 0.5925 | -0.2485 | -0.5979 | 0.7361 | 0.3493 | 1.2281 | -0.5496 | 0.6023 | -321.9356 | -309.3477 | -2.6821 | -2.7145 |
| 0.5458 | 0.92 | 3500 | 0.5925 | -0.2494 | -0.5988 | 0.7341 | 0.3494 | 1.2280 | -0.5516 | 0.6025 | -322.0256 | -309.4359 | -2.6792 | -2.7118 |
| 0.5926 | 0.94 | 3600 | 0.5925 | -0.2520 | -0.6014 | 0.7321 | 0.3494 | 1.2312 | -0.5539 | 0.6042 | -322.2860 | -309.6909 | -2.6837 | -2.7160 |
| 0.6096 | 0.97 | 3700 | 0.5926 | -0.2517 | -0.6015 | 0.7341 | 0.3497 | 1.2313 | -0.5539 | 0.6042 | -322.2966 | -309.6683 | -2.6793 | -2.7119 |
| 0.5865 | 0.99 | 3800 | 0.5925 | -0.2517 | -0.6019 | 0.7341 | 0.3502 | 1.2316 | -0.5546 | 0.6038 | -322.3433 | -309.6684 | -2.6801 | -2.7126 |
### Framework versions

- PEFT 0.7.1
- Transformers 4.39.0.dev0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2
adapter_model.safetensors (CHANGED)

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:607b501fcfaf72a6aa3455caf733453d4d58a4f3ab30b8a8e971b9ec9b0af0c1
 size 671150064
```
all_results.json (ADDED)

```diff
@@ -0,0 +1,8 @@
+{
+    "epoch": 1.0,
+    "train_loss": 0.619671463092813,
+    "train_runtime": 44477.573,
+    "train_samples": 61134,
+    "train_samples_per_second": 1.374,
+    "train_steps_per_second": 0.086
+}
```
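The reported throughput figures are internally consistent, which a quick check confirms. The step count is an assumption derived from 61,134 samples at the effective batch size of 16; it is not stated in the files themselves:

```python
import math

stats = {
    "train_runtime": 44477.573,  # seconds
    "train_samples": 61134,
    "train_samples_per_second": 1.374,
    "train_steps_per_second": 0.086,
}

samples_per_sec = stats["train_samples"] / stats["train_runtime"]
steps = math.ceil(stats["train_samples"] / 16)  # assumed effective batch 16
steps_per_sec = steps / stats["train_runtime"]
```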
runs/Jul22_23-14-44_node26/events.out.tfevents.1721657973.node26.3249241.0 (CHANGED)

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:0ab491a6461a6c2d91253b2087e7642b658580b098451f3f0607f33828d18ac2
+size 377805
```
train_results.json (ADDED)

```diff
@@ -0,0 +1,8 @@
+{
+    "epoch": 1.0,
+    "train_loss": 0.619671463092813,
+    "train_runtime": 44477.573,
+    "train_samples": 61134,
+    "train_samples_per_second": 1.374,
+    "train_steps_per_second": 0.086
+}
```
trainer_state.json (ADDED)

The diff for this file is too large to render.