Model save

README.md
ADDED
@@ -0,0 +1,120 @@
---
base_model: alignment-handbook/zephyr-7b-sft-full
library_name: peft
license: apache-2.0
tags:
- trl
- dpo
- generated_from_trainer
model-index:
- name: zephyr-dpop-qlora-uf-ours-uffull-5e-7
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# zephyr-dpop-qlora-uf-ours-uffull-5e-7

This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unspecified dataset (the trainer did not record a dataset name).
It achieves the following results on the evaluation set:
- Loss: 0.6825
- Positive Losses: 0.1480
- Dpo Losses: 0.6646
- Rewards/chosen: 0.1662
- Rewards/rejected: 0.1036
- Rewards/accuracies: 0.6810
- Rewards/margins: 0.0626
- Rewards/margins Max: 0.2720
- Rewards/margins Min: -0.1174
- Rewards/margins Std: 0.1305
- Logps/rejected: -255.4913
- Logps/chosen: -267.8348
- Logits/rejected: -2.7189
- Logits/chosen: -2.7542
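For orientation on the metric names: the reward columns follow the standard DPO definitions, where a completion's implicit reward is the beta-scaled log-probability ratio between the policy and the reference model, and the margin is the chosen reward minus the rejected reward:

$$
r_\theta(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)},
\qquad
\mathcal{L}_{\mathrm{DPO}} = -\log \sigma\bigl(r_\theta(x, y_{\text{chosen}}) - r_\theta(x, y_{\text{rejected}})\bigr)
$$

The separate "Positive Losses" column, together with the "dpop" in the model name, suggests a DPO-Positive-style objective that adds a penalty when the chosen completion's log-probability falls below the reference model's; treat that reading as an inference from the metric names rather than something the card documents.
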
## Model description

More information needed

## Intended uses & limitations

More information needed

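Since the card does not document usage yet, here is a minimal, untested loading sketch. The repo id below is a placeholder; the adapter's actual Hub location is not stated in the card.

```python
# Hedged sketch: load the LoRA adapter on top of its base model with PEFT.
# ADAPTER_ID is hypothetical -- substitute the real Hub repo id.
import torch
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

ADAPTER_ID = "your-org/zephyr-dpop-qlora-uf-ours-uffull-5e-7"  # placeholder

# AutoPeftModelForCausalLM reads base_model_name_or_path from the adapter
# config (alignment-handbook/zephyr-7b-sft-full) and attaches the adapter.
model = AutoPeftModelForCausalLM.from_pretrained(ADAPTER_ID, torch_dtype=torch.bfloat16)
tokenizer = AutoTokenizer.from_pretrained("alignment-handbook/zephyr-7b-sft-full")

messages = [{"role": "user", "content": "Briefly explain DPO training."}]
input_ids = tokenizer.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt")
output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
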
## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 2
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- total_eval_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1

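Given the `trl` and `dpo` tags, these settings plausibly correspond to a `transformers.TrainingArguments` handed to trl's DPO trainer. A sketch for orientation only, not the exact training script (with 2 devices, per-device batch 4, and 2 accumulation steps, the effective batch size is 2 x 4 x 2 = 16):

```python
from transformers import TrainingArguments

# Assumed mapping of the hyperparameters listed above; output_dir and bf16
# are guesses not recorded in the card.
training_args = TrainingArguments(
    output_dir="zephyr-dpop-qlora-uf-ours-uffull-5e-7",
    learning_rate=5e-7,
    per_device_train_batch_size=4,   # "train_batch_size" above is per device
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,   # 2 GPUs * 4 * 2 = total batch of 16
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    bf16=True,
)
```
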
### Training results

| Training Loss | Epoch | Step | Validation Loss | Positive Losses | Dpo Losses | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:---------------:|:----------:|:--------------:|:----------------:|:------------------:|:---------------:|:-------------------:|:-------------------:|:-------------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.694 | 0.02 | 100 | 0.6937 | 0.0064 | 0.6931 | 0.0049 | 0.0049 | 0.5075 | 0.0001 | 0.0049 | -0.0046 | 0.0032 | -265.3661 | -283.9625 | -2.7648 | -2.8001 |
| 0.6922 | 0.05 | 200 | 0.6930 | 0.0035 | 0.6926 | 0.0082 | 0.0071 | 0.5875 | 0.0011 | 0.0082 | -0.0056 | 0.0046 | -265.1425 | -283.6357 | -2.7650 | -2.8002 |
| 0.692 | 0.07 | 300 | 0.6921 | 0.0052 | 0.6914 | 0.0190 | 0.0154 | 0.6175 | 0.0035 | 0.0195 | -0.0103 | 0.0099 | -264.3096 | -282.5598 | -2.7662 | -2.8012 |
| 0.6914 | 0.1 | 400 | 0.6907 | 0.0081 | 0.6896 | 0.0324 | 0.0252 | 0.6435 | 0.0072 | 0.0364 | -0.0176 | 0.0181 | -263.3349 | -281.2179 | -2.7620 | -2.7972 |
| 0.6867 | 0.12 | 500 | 0.6887 | 0.0124 | 0.6868 | 0.0581 | 0.0451 | 0.6360 | 0.0130 | 0.0654 | -0.0313 | 0.0323 | -261.3455 | -278.6435 | -2.7580 | -2.7932 |
| 0.6903 | 0.14 | 600 | 0.6869 | 0.0213 | 0.6837 | 0.0696 | 0.0499 | 0.6565 | 0.0197 | 0.0949 | -0.0434 | 0.0461 | -260.8595 | -277.4952 | -2.7576 | -2.7926 |
| 0.6828 | 0.17 | 700 | 0.6855 | 0.0302 | 0.6813 | 0.0840 | 0.0592 | 0.6595 | 0.0248 | 0.1199 | -0.0539 | 0.0580 | -259.9324 | -276.0511 | -2.7490 | -2.7843 |
| 0.6758 | 0.19 | 800 | 0.6855 | 0.0526 | 0.6791 | 0.0969 | 0.0672 | 0.6550 | 0.0297 | 0.1423 | -0.0640 | 0.0688 | -259.1296 | -274.7613 | -2.7450 | -2.7804 |
| 0.6811 | 0.22 | 900 | 0.6854 | 0.0594 | 0.6771 | 0.1064 | 0.0725 | 0.6645 | 0.0339 | 0.1596 | -0.0715 | 0.0771 | -258.6040 | -273.8141 | -2.7378 | -2.7726 |
| 0.6803 | 0.24 | 1000 | 0.6845 | 0.0609 | 0.6762 | 0.1167 | 0.0807 | 0.6645 | 0.0360 | 0.1687 | -0.0763 | 0.0818 | -257.7856 | -272.7885 | -2.7285 | -2.7634 |
| 0.6759 | 0.26 | 1100 | 0.6842 | 0.0676 | 0.6750 | 0.1250 | 0.0862 | 0.6610 | 0.0388 | 0.1815 | -0.0829 | 0.0881 | -257.2345 | -271.9526 | -2.7320 | -2.7672 |
| 0.6732 | 0.29 | 1200 | 0.6896 | 0.1405 | 0.6722 | 0.1179 | 0.0727 | 0.6695 | 0.0452 | 0.2076 | -0.0939 | 0.1005 | -258.5845 | -272.6641 | -2.7315 | -2.7664 |
| 0.6748 | 0.31 | 1300 | 0.6835 | 0.0876 | 0.6734 | 0.1391 | 0.0966 | 0.6665 | 0.0425 | 0.1965 | -0.0897 | 0.0954 | -256.1944 | -270.5492 | -2.7357 | -2.7709 |
| 0.6872 | 0.34 | 1400 | 0.6834 | 0.0973 | 0.6721 | 0.1392 | 0.0939 | 0.6670 | 0.0453 | 0.2070 | -0.0930 | 0.1000 | -256.4647 | -270.5385 | -2.7367 | -2.7719 |
| 0.6926 | 0.36 | 1500 | 0.6833 | 0.1058 | 0.6710 | 0.1402 | 0.0925 | 0.6685 | 0.0477 | 0.2165 | -0.0956 | 0.1042 | -256.6026 | -270.4324 | -2.7329 | -2.7681 |
| 0.6862 | 0.38 | 1600 | 0.6891 | 0.1729 | 0.6689 | 0.1322 | 0.0796 | 0.6750 | 0.0526 | 0.2361 | -0.1039 | 0.1134 | -257.8935 | -271.2309 | -2.7292 | -2.7642 |
| 0.6779 | 0.41 | 1700 | 0.6821 | 0.0962 | 0.6698 | 0.1486 | 0.0979 | 0.6705 | 0.0507 | 0.2293 | -0.1016 | 0.1104 | -256.0604 | -269.5961 | -2.7308 | -2.7658 |
| 0.6726 | 0.43 | 1800 | 0.6842 | 0.1209 | 0.6687 | 0.1467 | 0.0934 | 0.6730 | 0.0533 | 0.2380 | -0.1060 | 0.1149 | -256.5087 | -269.7857 | -2.7266 | -2.7615 |
| 0.6688 | 0.45 | 1900 | 0.6834 | 0.1202 | 0.6681 | 0.1483 | 0.0938 | 0.6745 | 0.0545 | 0.2410 | -0.1065 | 0.1162 | -256.4724 | -269.6281 | -2.7300 | -2.7651 |
| 0.6616 | 0.48 | 2000 | 0.6818 | 0.1092 | 0.6681 | 0.1532 | 0.0987 | 0.6720 | 0.0545 | 0.2409 | -0.1069 | 0.1164 | -255.9825 | -269.1367 | -2.7336 | -2.7687 |
| 0.6707 | 0.5 | 2100 | 0.6804 | 0.0930 | 0.6684 | 0.1588 | 0.1049 | 0.6710 | 0.0538 | 0.2405 | -0.1069 | 0.1162 | -255.3586 | -268.5765 | -2.7300 | -2.7651 |
| 0.6796 | 0.53 | 2200 | 0.6849 | 0.1551 | 0.6666 | 0.1500 | 0.0920 | 0.6755 | 0.0580 | 0.2565 | -0.1121 | 0.1234 | -256.6537 | -269.4551 | -2.7228 | -2.7582 |
| 0.6672 | 0.55 | 2300 | 0.6830 | 0.1404 | 0.6668 | 0.1562 | 0.0986 | 0.6725 | 0.0576 | 0.2557 | -0.1114 | 0.1231 | -255.9975 | -268.8366 | -2.7203 | -2.7554 |
| 0.6769 | 0.57 | 2400 | 0.6819 | 0.1252 | 0.6668 | 0.1596 | 0.1019 | 0.6740 | 0.0577 | 0.2565 | -0.1128 | 0.1238 | -255.6599 | -268.4941 | -2.7159 | -2.7508 |
| 0.6725 | 0.6 | 2500 | 0.6903 | 0.2239 | 0.6645 | 0.1488 | 0.0859 | 0.6850 | 0.0630 | 0.2751 | -0.1201 | 0.1325 | -257.2663 | -269.5727 | -2.7161 | -2.7509 |
| 0.6762 | 0.62 | 2600 | 0.6834 | 0.1472 | 0.6655 | 0.1615 | 0.1008 | 0.6760 | 0.0606 | 0.2671 | -0.1166 | 0.1287 | -255.7709 | -268.3081 | -2.7154 | -2.7503 |
| 0.6867 | 0.65 | 2700 | 0.6846 | 0.1619 | 0.6649 | 0.1605 | 0.0985 | 0.6820 | 0.0620 | 0.2708 | -0.1178 | 0.1304 | -256.0078 | -268.4086 | -2.7205 | -2.7554 |
| 0.702 | 0.67 | 2800 | 0.6836 | 0.1510 | 0.6651 | 0.1623 | 0.1007 | 0.6815 | 0.0616 | 0.2697 | -0.1175 | 0.1299 | -255.7832 | -268.2218 | -2.7157 | -2.7510 |
| 0.6822 | 0.69 | 2900 | 0.6818 | 0.1312 | 0.6653 | 0.1655 | 0.1045 | 0.6800 | 0.0610 | 0.2669 | -0.1156 | 0.1282 | -255.4075 | -267.9095 | -2.7201 | -2.7554 |
| 0.6751 | 0.72 | 3000 | 0.6809 | 0.1235 | 0.6656 | 0.1674 | 0.1070 | 0.6745 | 0.0604 | 0.2651 | -0.1144 | 0.1272 | -255.1547 | -267.7156 | -2.7193 | -2.7547 |
| 0.673 | 0.74 | 3100 | 0.6830 | 0.1523 | 0.6648 | 0.1643 | 0.1022 | 0.6815 | 0.0621 | 0.2709 | -0.1168 | 0.1301 | -255.6314 | -268.0210 | -2.7211 | -2.7563 |
| 0.6666 | 0.77 | 3200 | 0.6818 | 0.1381 | 0.6653 | 0.1672 | 0.1062 | 0.6785 | 0.0611 | 0.2675 | -0.1157 | 0.1284 | -255.2344 | -267.7304 | -2.7202 | -2.7554 |
| 0.6619 | 0.79 | 3300 | 0.6829 | 0.1523 | 0.6647 | 0.1652 | 0.1028 | 0.6810 | 0.0624 | 0.2717 | -0.1172 | 0.1304 | -255.5768 | -267.9396 | -2.7207 | -2.7559 |
| 0.6752 | 0.81 | 3400 | 0.6830 | 0.1530 | 0.6647 | 0.1653 | 0.1029 | 0.6805 | 0.0625 | 0.2718 | -0.1177 | 0.1306 | -255.5670 | -267.9222 | -2.7197 | -2.7548 |
| 0.6711 | 0.84 | 3500 | 0.6841 | 0.1663 | 0.6643 | 0.1634 | 0.1000 | 0.6795 | 0.0633 | 0.2740 | -0.1183 | 0.1317 | -255.8493 | -268.1196 | -2.7188 | -2.7540 |
| 0.669 | 0.86 | 3600 | 0.6843 | 0.1689 | 0.6642 | 0.1628 | 0.0992 | 0.6815 | 0.0637 | 0.2755 | -0.1190 | 0.1323 | -255.9366 | -268.1706 | -2.7180 | -2.7533 |
| 0.6563 | 0.89 | 3700 | 0.6835 | 0.1602 | 0.6643 | 0.1642 | 0.1009 | 0.6815 | 0.0633 | 0.2740 | -0.1182 | 0.1316 | -255.7627 | -268.0358 | -2.7189 | -2.7540 |
| 0.6811 | 0.91 | 3800 | 0.6828 | 0.1517 | 0.6646 | 0.1658 | 0.1032 | 0.6820 | 0.0627 | 0.2721 | -0.1176 | 0.1307 | -255.5359 | -267.8722 | -2.7190 | -2.7541 |
| 0.664 | 0.93 | 3900 | 0.6823 | 0.1453 | 0.6647 | 0.1664 | 0.1039 | 0.6780 | 0.0625 | 0.2717 | -0.1171 | 0.1305 | -255.4641 | -267.8119 | -2.7221 | -2.7571 |
| 0.6771 | 0.96 | 4000 | 0.6824 | 0.1453 | 0.6647 | 0.1662 | 0.1037 | 0.6775 | 0.0625 | 0.2716 | -0.1174 | 0.1304 | -255.4852 | -267.8388 | -2.7216 | -2.7566 |
| 0.6644 | 0.98 | 4100 | 0.6825 | 0.1480 | 0.6646 | 0.1662 | 0.1036 | 0.6810 | 0.0626 | 0.2720 | -0.1174 | 0.1305 | -255.4913 | -267.8348 | -2.7189 | -2.7542 |

### Framework versions

- PEFT 0.7.1
- Transformers 4.39.0.dev0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2

adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:0371fbb85b053648e8e0b36a6104b5f3b86e6c8e5913bb3e7c0dff078989bde3
 size 671150064

all_results.json
ADDED
@@ -0,0 +1,8 @@
{
    "epoch": 1.0,
    "train_loss": 0.6776896565581647,
    "train_runtime": 67917.559,
    "train_samples": 66812,
    "train_samples_per_second": 0.984,
    "train_steps_per_second": 0.061
}

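As a quick consistency check, the reported throughput follows from the sample count, the runtime, and the effective batch size of 16 listed in the README above:

```python
# Recompute the throughput figures in all_results.json (values from the file;
# total_train_batch_size = 16 comes from the training hyperparameters).
train_samples = 66812
train_runtime_s = 67917.559
total_train_batch_size = 16

print(train_samples / train_runtime_s)                           # ~0.984 samples/s
print(train_samples / total_train_batch_size / train_runtime_s)  # ~0.061 steps/s
```
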
runs/Jul16_03-11-50_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1721100015.notebook-deployment-48-7d9b6c99-p5kv4.69846.0
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:7e8a568606180f4e015aaf931bdc62cb23d4f4c17f3aa08d339089b70cee5b14
+size 464128

train_results.json
ADDED
@@ -0,0 +1,8 @@
{
    "epoch": 1.0,
    "train_loss": 0.6776896565581647,
    "train_runtime": 67917.559,
    "train_samples": 66812,
    "train_samples_per_second": 0.984,
    "train_steps_per_second": 0.061
}

trainer_state.json
ADDED
The diff for this file is too large to render. See raw diff.