just1nseo committed · verified
Commit cf35523 · 1 Parent(s): cfc5ecb

Model save
README.md ADDED
@@ -0,0 +1,117 @@
+ ---
+ base_model: alignment-handbook/zephyr-7b-sft-full
+ library_name: peft
+ license: apache-2.0
+ tags:
+ - trl
+ - dpo
+ - generated_from_trainer
+ model-index:
+ - name: zephyr-dpop-qlora-uf-5e-7
+   results: []
+ ---
+
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->
+
+ # zephyr-dpop-qlora-uf-5e-7
+
+ This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unspecified dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 0.6789
+ - Positive Losses: 0.2612
+ - Dpo Losses: 0.6395
+ - Rewards/chosen: 0.2320
+ - Rewards/rejected: 0.1098
+ - Rewards/accuracies: 0.7220
+ - Rewards/margins: 0.1222
+ - Rewards/margins Max: 0.4482
+ - Rewards/margins Min: -0.1549
+ - Rewards/margins Std: 0.2025
+ - Logps/rejected: -247.6030
+ - Logps/chosen: -261.3938
+ - Logits/rejected: -2.6204
+ - Logits/chosen: -2.6544
+
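The reward and margin columns above follow standard DPO bookkeeping: each reward is β times the log-probability ratio between the policy and the frozen reference model, and the margin is the chosen reward minus the rejected reward. A minimal sketch in plain Python (the function name, the illustrative inputs, and `beta=0.1` are assumptions for illustration; the β used for this run is not stated in the card, and the "Positive Losses" column suggests an additional DPO-Positive-style penalty term on the chosen response that is not modelled here):

```python
import math

def dpo_stats(policy_chosen_logp, policy_rejected_logp,
              ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Compute DPO rewards, margin, and loss from summed log-probs.

    beta=0.1 is a common default in TRL; the value used for this
    run is not stated in the card.
    """
    # Implicit rewards: beta-scaled log-ratio of policy vs. reference.
    reward_chosen = beta * (policy_chosen_logp - ref_chosen_logp)
    reward_rejected = beta * (policy_rejected_logp - ref_rejected_logp)
    # The logged "rewards/margins" is the chosen-minus-rejected gap.
    margin = reward_chosen - reward_rejected
    # DPO loss: -log(sigmoid(margin)); beta is already folded into margin.
    loss = -math.log(1.0 / (1.0 + math.exp(-margin)))
    return reward_chosen, reward_rejected, margin, loss
```

With zero margin the loss is log 2 ≈ 0.693, which is why the training loss starts near 0.69 in the table below.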
+ ## Model description
+
+ More information needed
+
+ ## Intended uses & limitations
+
+ More information needed
+
+ ## Training and evaluation data
+
+ More information needed
+
+ ## Training procedure
+
+ ### Training hyperparameters
+
+ The following hyperparameters were used during training:
+ - learning_rate: 5e-06
+ - train_batch_size: 4
+ - eval_batch_size: 8
+ - seed: 42
+ - distributed_type: multi-GPU
+ - num_devices: 2
+ - gradient_accumulation_steps: 2
+ - total_train_batch_size: 16
+ - total_eval_batch_size: 16
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_ratio: 0.1
+ - num_epochs: 1
+
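As a sanity check, the total train batch size above is the per-device batch size times the number of devices times the gradient-accumulation steps:

```python
train_batch_size = 4             # per device
num_devices = 2                  # multi-GPU
gradient_accumulation_steps = 2

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)    # 16, matching the reported value
```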
+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Positive Losses | Dpo Losses | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+ |:-------------:|:-----:|:----:|:---------------:|:---------------:|:----------:|:--------------:|:----------------:|:------------------:|:---------------:|:-------------------:|:-------------------:|:-------------------:|:--------------:|:------------:|:---------------:|:-------------:|
+ | 0.6911 | 0.03 | 100 | 0.6918 | 0.0097 | 0.6901 | 0.0256 | 0.0195 | 0.6670 | 0.0061 | 0.0282 | -0.0136 | 0.0138 | -256.6321 | -282.0331 | -2.7662 | -2.8054 |
+ | 0.6847 | 0.05 | 200 | 0.6919 | 0.0310 | 0.6806 | 0.0844 | 0.0583 | 0.6710 | 0.0261 | 0.1132 | -0.0512 | 0.0542 | -252.7455 | -276.1540 | -2.7592 | -2.7987 |
+ | 0.686 | 0.08 | 300 | 0.6901 | 0.0841 | 0.6693 | 0.1465 | 0.0956 | 0.6950 | 0.0509 | 0.2071 | -0.0915 | 0.0989 | -249.0196 | -269.9467 | -2.7474 | -2.7859 |
+ | 0.6944 | 0.1 | 400 | 0.6911 | 0.1510 | 0.6631 | 0.1581 | 0.0931 | 0.7100 | 0.0650 | 0.2490 | -0.1113 | 0.1195 | -249.2730 | -268.7827 | -2.7115 | -2.7504 |
+ | 0.6923 | 0.13 | 500 | 0.6788 | 0.0596 | 0.6647 | 0.1948 | 0.1332 | 0.6950 | 0.0617 | 0.2513 | -0.1077 | 0.1190 | -245.2602 | -265.1090 | -2.6843 | -2.7243 |
+ | 0.663 | 0.16 | 600 | 0.6892 | 0.1483 | 0.6607 | 0.1942 | 0.1226 | 0.6770 | 0.0716 | 0.3008 | -0.1286 | 0.1420 | -246.3230 | -265.1740 | -2.6660 | -2.7036 |
+ | 0.6784 | 0.18 | 700 | 0.6935 | 0.2142 | 0.6550 | 0.1892 | 0.1049 | 0.6970 | 0.0843 | 0.3275 | -0.1274 | 0.1516 | -248.0892 | -265.6756 | -2.6229 | -2.6624 |
+ | 0.661 | 0.21 | 800 | 0.6885 | 0.1770 | 0.6538 | 0.1994 | 0.1122 | 0.7020 | 0.0872 | 0.3388 | -0.1292 | 0.1549 | -247.3548 | -264.6508 | -2.6850 | -2.7245 |
+ | 0.6736 | 0.24 | 900 | 0.6827 | 0.1576 | 0.6557 | 0.2025 | 0.1192 | 0.6940 | 0.0833 | 0.3345 | -0.1335 | 0.1561 | -246.6593 | -264.3388 | -2.6814 | -2.7201 |
+ | 0.6998 | 0.26 | 1000 | 0.6806 | 0.2131 | 0.6517 | 0.2037 | 0.1115 | 0.7070 | 0.0922 | 0.3499 | -0.1335 | 0.1615 | -247.4245 | -264.2192 | -2.6830 | -2.7190 |
+ | 0.6943 | 0.29 | 1100 | 0.6808 | 0.2125 | 0.6503 | 0.2101 | 0.1144 | 0.7100 | 0.0957 | 0.3629 | -0.1371 | 0.1674 | -247.1344 | -263.5789 | -2.6633 | -2.6979 |
+ | 0.6761 | 0.31 | 1200 | 0.6793 | 0.1898 | 0.6511 | 0.2157 | 0.1215 | 0.7110 | 0.0942 | 0.3704 | -0.1366 | 0.1692 | -246.4255 | -263.0201 | -2.6573 | -2.6916 |
+ | 0.6976 | 0.34 | 1300 | 0.6730 | 0.1194 | 0.6535 | 0.2178 | 0.1297 | 0.7080 | 0.0881 | 0.3434 | -0.1322 | 0.1594 | -245.6055 | -262.8122 | -2.6282 | -2.6641 |
+ | 0.7536 | 0.37 | 1400 | 0.7005 | 0.3143 | 0.6471 | 0.2121 | 0.1083 | 0.7030 | 0.1038 | 0.3986 | -0.1530 | 0.1838 | -247.7509 | -263.3833 | -2.6211 | -2.6572 |
+ | 0.6711 | 0.39 | 1500 | 0.6918 | 0.2213 | 0.6489 | 0.2190 | 0.1197 | 0.7040 | 0.0994 | 0.3826 | -0.1451 | 0.1760 | -246.6128 | -262.6917 | -2.5983 | -2.6356 |
+ | 0.7428 | 0.42 | 1600 | 0.6867 | 0.1652 | 0.6501 | 0.2193 | 0.1228 | 0.7010 | 0.0965 | 0.3730 | -0.1448 | 0.1730 | -246.2957 | -262.6611 | -2.5979 | -2.6328 |
+ | 0.6593 | 0.44 | 1700 | 0.6785 | 0.2228 | 0.6467 | 0.2221 | 0.1173 | 0.7110 | 0.1048 | 0.3978 | -0.1483 | 0.1825 | -246.8526 | -262.3859 | -2.6262 | -2.6614 |
+ | 0.6856 | 0.47 | 1800 | 0.6702 | 0.1343 | 0.6504 | 0.2318 | 0.1356 | 0.6980 | 0.0962 | 0.3760 | -0.1454 | 0.1748 | -245.0162 | -261.4142 | -2.5972 | -2.6326 |
+ | 0.6552 | 0.5 | 1900 | 0.6743 | 0.1855 | 0.6484 | 0.2278 | 0.1267 | 0.6990 | 0.1011 | 0.3920 | -0.1494 | 0.1816 | -245.9063 | -261.8096 | -2.5761 | -2.6118 |
+ | 0.6577 | 0.52 | 2000 | 0.6748 | 0.2036 | 0.6461 | 0.2310 | 0.1247 | 0.7090 | 0.1063 | 0.4016 | -0.1526 | 0.1853 | -246.1064 | -261.4890 | -2.5869 | -2.6241 |
+ | 0.6695 | 0.55 | 2100 | 0.6841 | 0.2842 | 0.6443 | 0.2230 | 0.1124 | 0.7100 | 0.1106 | 0.4202 | -0.1537 | 0.1915 | -247.3420 | -262.2980 | -2.6033 | -2.6404 |
+ | 0.6633 | 0.58 | 2200 | 0.6799 | 0.2580 | 0.6435 | 0.2273 | 0.1147 | 0.7140 | 0.1126 | 0.4254 | -0.1549 | 0.1932 | -247.1040 | -261.8589 | -2.6014 | -2.6383 |
+ | 0.7136 | 0.6 | 2300 | 0.6781 | 0.2376 | 0.6443 | 0.2290 | 0.1183 | 0.7110 | 0.1107 | 0.4197 | -0.1532 | 0.1914 | -246.7446 | -261.6907 | -2.6118 | -2.6471 |
+ | 0.6631 | 0.63 | 2400 | 0.6769 | 0.2289 | 0.6450 | 0.2285 | 0.1195 | 0.7080 | 0.1090 | 0.4134 | -0.1509 | 0.1882 | -246.6301 | -261.7479 | -2.6072 | -2.6430 |
+ | 0.6884 | 0.65 | 2500 | 0.6854 | 0.3215 | 0.6404 | 0.2248 | 0.1047 | 0.7120 | 0.1201 | 0.4408 | -0.1583 | 0.2000 | -248.1103 | -262.1167 | -2.6064 | -2.6413 |
+ | 0.6701 | 0.68 | 2600 | 0.6817 | 0.2661 | 0.6432 | 0.2290 | 0.1154 | 0.7240 | 0.1136 | 0.4344 | -0.1554 | 0.1960 | -247.0384 | -261.6952 | -2.6116 | -2.6458 |
+ | 0.668 | 0.71 | 2700 | 0.6771 | 0.2209 | 0.6441 | 0.2330 | 0.1218 | 0.7190 | 0.1112 | 0.4213 | -0.1525 | 0.1911 | -246.4004 | -261.2966 | -2.6196 | -2.6533 |
+ | 0.6851 | 0.73 | 2800 | 0.6777 | 0.2299 | 0.6430 | 0.2330 | 0.1192 | 0.7090 | 0.1138 | 0.4274 | -0.1550 | 0.1946 | -246.6621 | -261.2937 | -2.6278 | -2.6613 |
+ | 0.678 | 0.76 | 2900 | 0.6856 | 0.2997 | 0.6402 | 0.2278 | 0.1072 | 0.7110 | 0.1207 | 0.4462 | -0.1603 | 0.2028 | -247.8615 | -261.8085 | -2.6269 | -2.6602 |
+ | 0.6605 | 0.79 | 3000 | 0.6807 | 0.2415 | 0.6412 | 0.2316 | 0.1134 | 0.7160 | 0.1182 | 0.4380 | -0.1547 | 0.1986 | -247.2367 | -261.4324 | -2.6275 | -2.6605 |
+ | 0.6874 | 0.81 | 3100 | 0.6753 | 0.2061 | 0.6425 | 0.2349 | 0.1199 | 0.7190 | 0.1150 | 0.4300 | -0.1520 | 0.1951 | -246.5852 | -261.0995 | -2.6151 | -2.6494 |
+ | 0.6516 | 0.84 | 3200 | 0.6828 | 0.3006 | 0.6385 | 0.2284 | 0.1036 | 0.7160 | 0.1248 | 0.4527 | -0.1586 | 0.2052 | -248.2176 | -261.7539 | -2.6158 | -2.6498 |
+ | 0.6627 | 0.86 | 3300 | 0.6773 | 0.2406 | 0.6403 | 0.2325 | 0.1123 | 0.7190 | 0.1203 | 0.4419 | -0.1545 | 0.2003 | -247.3520 | -261.3398 | -2.6184 | -2.6526 |
+ | 0.6517 | 0.89 | 3400 | 0.6814 | 0.2865 | 0.6386 | 0.2300 | 0.1056 | 0.7190 | 0.1244 | 0.4519 | -0.1569 | 0.2045 | -248.0181 | -261.5968 | -2.6213 | -2.6551 |
+ | 0.7267 | 0.92 | 3500 | 0.6810 | 0.2880 | 0.6385 | 0.2302 | 0.1056 | 0.7200 | 0.1246 | 0.4536 | -0.1569 | 0.2050 | -248.0208 | -261.5744 | -2.6222 | -2.6560 |
+ | 0.6563 | 0.94 | 3600 | 0.6790 | 0.2627 | 0.6394 | 0.2318 | 0.1093 | 0.7220 | 0.1225 | 0.4487 | -0.1550 | 0.2027 | -247.6491 | -261.4136 | -2.6216 | -2.6555 |
+ | 0.7039 | 0.97 | 3700 | 0.6790 | 0.2634 | 0.6396 | 0.2320 | 0.1099 | 0.7230 | 0.1222 | 0.4483 | -0.1550 | 0.2025 | -247.5927 | -261.3918 | -2.6220 | -2.6559 |
+ | 0.6622 | 0.99 | 3800 | 0.6789 | 0.2612 | 0.6395 | 0.2320 | 0.1098 | 0.7220 | 0.1222 | 0.4482 | -0.1549 | 0.2025 | -247.6030 | -261.3938 | -2.6204 | -2.6544 |
+
+
+ ### Framework versions
+
+ - PEFT 0.7.1
+ - Transformers 4.39.0.dev0
+ - Pytorch 2.1.2+cu121
+ - Datasets 2.14.6
+ - Tokenizers 0.15.2
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:5b9df00c16b8ce3788645baa20d5832d529d941ee10833eda2d98338a4dee535
+ oid sha256:35dd0a2cb8300775ea20bf46664246ce1114f721ebb682f16d8faa982734aa9e
  size 671150064
all_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+     "epoch": 1.0,
+     "train_loss": 0.6795008113577053,
+     "train_runtime": 46141.1523,
+     "train_samples": 61134,
+     "train_samples_per_second": 1.325,
+     "train_steps_per_second": 0.083
+ }
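The reported throughput figures are internally consistent: `train_samples / train_runtime` reproduces `train_samples_per_second`, and dividing by the effective batch size of 16 (from the hyperparameters section) reproduces `train_steps_per_second`:

```python
train_samples = 61134
train_runtime = 46141.1523   # seconds
total_train_batch_size = 16  # from the hyperparameters section

samples_per_second = train_samples / train_runtime
steps_per_second = samples_per_second / total_train_batch_size

print(round(samples_per_second, 3))  # 1.325
print(round(steps_per_second, 3))    # 0.083
```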
runs/Jul14_16-51-33_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1720976649.notebook-deployment-48-7d9b6c99-p5kv4.34769.0 CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:edc1d474675c4b1c1bd77633dfb7696e1736859d925cbec35ae66710ea8f32a4
- size 423792
+ oid sha256:4f89c53e22282e2a2827a0d90885352830744fd149ebae2c8fdd0c21ba9eb3ed
+ size 426136
train_results.json ADDED
@@ -0,0 +1,8 @@
+ {
+     "epoch": 1.0,
+     "train_loss": 0.6795008113577053,
+     "train_runtime": 46141.1523,
+     "train_samples": 61134,
+     "train_samples_per_second": 1.325,
+     "train_steps_per_second": 0.083
+ }
trainer_state.json ADDED
The diff for this file is too large to render. See raw diff