Model save

Files changed (6)

README.md +118 -0
adapter_model.safetensors +1 -1
all_results.json +8 -0
runs/Jul16_03-11-51_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1721100016.notebook-deployment-48-7d9b6c99-p5kv4.69914.0 +2 -2
train_results.json +8 -0
trainer_state.json +0 -0

README.md ADDED

@@ -0,0 +1,118 @@
+---
+base_model: alignment-handbook/zephyr-7b-sft-full
+library_name: peft
+license: apache-2.0
+tags:
+- trl
+- dpo
+- generated_from_trainer
+model-index:
+- name: zephyr-dpo-qlora-uf-ours-uffull-5e-7
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# zephyr-dpo-qlora-uf-ours-uffull-5e-7
+This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.5925
+- Rewards/chosen: -0.2025
+- Rewards/rejected: -0.5175
+- Rewards/accuracies: 0.7060
+- Rewards/margins: 0.3149
+- Rewards/margins Max: 1.1833
+- Rewards/margins Min: -0.5415
+- Rewards/margins Std: 0.5824
+- Logps/rejected: -317.5993
+- Logps/chosen: -304.7070
+- Logits/rejected: -2.5752
+- Logits/chosen: -2.6056
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-07
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
+- total_eval_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:-------------------:|:-------------------:|:-------------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6931        | 0.02  | 100  | 0.6930          | 0.0002         | -0.0000          | 0.5130             | 0.0002          | 0.0047              | -0.0043             | 0.0030              | -265.8543      | -284.4388    | -2.7682         | -2.8031       |
+| 0.692         | 0.05  | 200  | 0.6923          | 0.0017         | 0.0000           | 0.6220             | 0.0017          | 0.0099              | -0.0057             | 0.0051              | -265.8525      | -284.2892    | -2.7668         | -2.8017       |
+| 0.6903        | 0.07  | 300  | 0.6908          | 0.0067         | 0.0019           | 0.6520             | 0.0048          | 0.0253              | -0.0125             | 0.0125              | -265.6623      | -283.7856    | -2.7627         | -2.7978       |
+| 0.6888        | 0.1   | 400  | 0.6880          | 0.0104         | -0.0004          | 0.6645             | 0.0108          | 0.0545              | -0.0264             | 0.0268              | -265.8943      | -283.4167    | -2.7573         | -2.7924       |
+| 0.6827        | 0.12  | 500  | 0.6834          | 0.0345         | 0.0138           | 0.6820             | 0.0207          | 0.0989              | -0.0454             | 0.0479              | -264.4715      | -281.0052    | -2.7529         | -2.7877       |
+| 0.6831        | 0.14  | 600  | 0.6776          | 0.0296         | -0.0039          | 0.6910             | 0.0335          | 0.1552              | -0.0696             | 0.0745              | -266.2422      | -281.4937    | -2.7479         | -2.7827       |
+| 0.6652        | 0.17  | 700  | 0.6700          | 0.0086         | -0.0427          | 0.6820             | 0.0513          | 0.2350              | -0.1057             | 0.1128              | -270.1202      | -283.5948    | -2.7382         | -2.7726       |
+| 0.6486        | 0.19  | 800  | 0.6615          | -0.0198        | -0.0921          | 0.6805             | 0.0723          | 0.3237              | -0.1470             | 0.1565              | -275.0622      | -286.4378    | -2.7367         | -2.7702       |
+| 0.6457        | 0.22  | 900  | 0.6531          | -0.0599        | -0.1549          | 0.6755             | 0.0950          | 0.4216              | -0.1947             | 0.2059              | -281.3418      | -290.4436    | -2.7168         | -2.7500       |
+| 0.6356        | 0.24  | 1000 | 0.6449          | -0.0625        | -0.1814          | 0.6785             | 0.1188          | 0.5225              | -0.2486             | 0.2583              | -283.9890      | -290.7086    | -2.7042         | -2.7362       |
+| 0.6465        | 0.26  | 1100 | 0.6378          | -0.0291        | -0.1702          | 0.6775             | 0.1411          | 0.6108              | -0.2946             | 0.3031              | -282.8690      | -287.3659    | -2.6982         | -2.7301       |
+| 0.6121        | 0.29  | 1200 | 0.6317          | -0.0658        | -0.2261          | 0.6780             | 0.1603          | 0.6847              | -0.3354             | 0.3418              | -288.4626      | -291.0350    | -2.6893         | -2.7208       |
+| 0.6113        | 0.31  | 1300 | 0.6287          | -0.1819        | -0.3556          | 0.6820             | 0.1737          | 0.7287              | -0.3416             | 0.3621              | -301.4144      | -302.6470    | -2.6941         | -2.7251       |
+| 0.6058        | 0.34  | 1400 | 0.6234          | -0.1290        | -0.3204          | 0.6775             | 0.1914          | 0.7908              | -0.3943             | 0.3995              | -297.8902      | -297.3538    | -2.6823         | -2.7135       |
+| 0.6169        | 0.36  | 1500 | 0.6194          | -0.1244        | -0.3286          | 0.6790             | 0.2042          | 0.8341              | -0.4094             | 0.4197              | -298.7180      | -296.9003    | -2.6648         | -2.6957       |
+| 0.5809        | 0.38  | 1600 | 0.6163          | -0.1125        | -0.3291          | 0.6800             | 0.2167          | 0.8823              | -0.4243             | 0.4399              | -298.7659      | -295.7021    | -2.6547         | -2.6853       |
+| 0.5979        | 0.41  | 1700 | 0.6161          | -0.2126        | -0.4403          | 0.6805             | 0.2276          | 0.9153              | -0.4469             | 0.4624              | -309.8821      | -305.7201    | -2.6466         | -2.6773       |
+| 0.6034        | 0.43  | 1800 | 0.6124          | -0.1652        | -0.4014          | 0.6805             | 0.2362          | 0.9410              | -0.4507             | 0.4726              | -305.9889      | -300.9712    | -2.6365         | -2.6672       |
+| 0.5983        | 0.45  | 1900 | 0.6144          | -0.0531        | -0.2743          | 0.6900             | 0.2212          | 0.8923              | -0.3931             | 0.4327              | -293.2797      | -289.7628    | -2.6389         | -2.6689       |
+| 0.5822        | 0.48  | 2000 | 0.6049          | -0.1502        | -0.4096          | 0.6885             | 0.2593          | 1.0070              | -0.4697             | 0.4998              | -306.8109      | -299.4801    | -2.6378         | -2.6679       |
+| 0.6013        | 0.5   | 2100 | 0.6034          | -0.1787        | -0.4453          | 0.6870             | 0.2666          | 1.0331              | -0.4819             | 0.5137              | -310.3860      | -302.3300    | -2.6289         | -2.6593       |
+| 0.6018        | 0.53  | 2200 | 0.6019          | -0.1572        | -0.4295          | 0.6925             | 0.2723          | 1.0473              | -0.4896             | 0.5205              | -308.8055      | -300.1773    | -2.6287         | -2.6585       |
+| 0.6121        | 0.55  | 2300 | 0.6010          | -0.2434        | -0.5217          | 0.6905             | 0.2783          | 1.0633              | -0.4893             | 0.5289              | -318.0273      | -308.7991    | -2.6178         | -2.6476       |
+| 0.5698        | 0.57  | 2400 | 0.5979          | -0.1902        | -0.4780          | 0.6920             | 0.2878          | 1.0879              | -0.4939             | 0.5369              | -313.6557      | -303.4752    | -2.6092         | -2.6389       |
+| 0.5656        | 0.6   | 2500 | 0.5992          | -0.2708        | -0.5597          | 0.6985             | 0.2889          | 1.0980              | -0.5097             | 0.5454              | -321.8217      | -311.5382    | -2.5991         | -2.6291       |
+| 0.5795        | 0.62  | 2600 | 0.5950          | -0.2109        | -0.5113          | 0.6950             | 0.3003          | 1.1206              | -0.5079             | 0.5533              | -316.9805      | -305.5476    | -2.5944         | -2.6244       |
+| 0.5909        | 0.65  | 2700 | 0.5945          | -0.2006        | -0.5044          | 0.6950             | 0.3038          | 1.1335              | -0.5150             | 0.5598              | -316.2979      | -304.5152    | -2.5934         | -2.6235       |
+| 0.6097        | 0.67  | 2800 | 0.5938          | -0.2035        | -0.5091          | 0.6975             | 0.3055          | 1.1391              | -0.5171             | 0.5610              | -316.7604      | -304.8101    | -2.5909         | -2.6210       |
+| 0.5776        | 0.69  | 2900 | 0.5929          | -0.2142        | -0.5232          | 0.7040             | 0.3091          | 1.1530              | -0.5251             | 0.5673              | -318.1778      | -305.8716    | -2.5874         | -2.6177       |
+| 0.575         | 0.72  | 3000 | 0.5948          | -0.1848        | -0.4886          | 0.6980             | 0.3039          | 1.1465              | -0.5243             | 0.5647              | -314.7165      | -302.9333    | -2.5861         | -2.6165       |
+| 0.5767        | 0.74  | 3100 | 0.5936          | -0.1972        | -0.5061          | 0.7010             | 0.3089          | 1.1551              | -0.5276             | 0.5690              | -316.4648      | -304.1734    | -2.5862         | -2.6166       |
+| 0.5642        | 0.77  | 3200 | 0.5937          | -0.1943        | -0.5034          | 0.7010             | 0.3091          | 1.1615              | -0.5332             | 0.5726              | -316.1906      | -303.8846    | -2.5867         | -2.6170       |
+| 0.5767        | 0.79  | 3300 | 0.5914          | -0.2376        | -0.5569          | 0.7050             | 0.3193          | 1.1828              | -0.5330             | 0.5823              | -321.5458      | -308.2144    | -2.5828         | -2.6131       |
+| 0.5685        | 0.81  | 3400 | 0.5914          | -0.2246        | -0.5434          | 0.7045             | 0.3188          | 1.1858              | -0.5380             | 0.5834              | -320.1958      | -306.9150    | -2.5800         | -2.6103       |
+| 0.5687        | 0.84  | 3500 | 0.5909          | -0.2343        | -0.5556          | 0.7045             | 0.3214          | 1.1905              | -0.5370             | 0.5855              | -321.4169      | -307.8832    | -2.5779         | -2.6082       |
+| 0.5598        | 0.86  | 3600 | 0.5924          | -0.2063        | -0.5212          | 0.7060             | 0.3150          | 1.1819              | -0.5400             | 0.5817              | -317.9754      | -305.0805    | -2.5781         | -2.6084       |
+| 0.5639        | 0.89  | 3700 | 0.5921          | -0.2090        | -0.5258          | 0.7055             | 0.3168          | 1.1849              | -0.5399             | 0.5831              | -318.4354      | -305.3578    | -2.5751         | -2.6056       |
+| 0.5931        | 0.91  | 3800 | 0.5930          | -0.1985        | -0.5119          | 0.7060             | 0.3134          | 1.1790              | -0.5399             | 0.5802              | -317.0424      | -304.3084    | -2.5778         | -2.6081       |
+| 0.5542        | 0.93  | 3900 | 0.5929          | -0.1989        | -0.5128          | 0.7060             | 0.3139          | 1.1807              | -0.5398             | 0.5808              | -317.1321      | -304.3491    | -2.5760         | -2.6064       |
+| 0.5713        | 0.96  | 4000 | 0.5926          | -0.2022        | -0.5175          | 0.7050             | 0.3153          | 1.1831              | -0.5407             | 0.5823              | -317.6028      | -304.6741    | -2.5743         | -2.6048       |
+| 0.5725        | 0.98  | 4100 | 0.5925          | -0.2025        | -0.5175          | 0.7060             | 0.3149          | 1.1833              | -0.5415             | 0.5824              | -317.5993      | -304.7070    | -2.5752         | -2.6056       |
+### Framework versions
+- PEFT 0.7.1
+- Transformers 4.39.0.dev0
+- Pytorch 2.1.2+cu121
+- Datasets 2.14.6
+- Tokenizers 0.15.2
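
The evaluation metrics in the model card can be cross-checked against each other. The sketch below assumes the run used the standard sigmoid DPO loss, where the per-example loss is `-log(sigmoid(margin))` and the margin is the chosen reward minus the rejected reward. The run name hints at a custom variant, so treat this as a sanity check under that assumption, not a description of the actual training objective. The helper name `dpo_loss_from_margin` is made up for illustration.

```python
import math

# Final evaluation metrics, copied from the model card above.
rewards_chosen = -0.2025
rewards_rejected = -0.5175
reported_margin = 0.3149
reported_eval_loss = 0.5925


def dpo_loss_from_margin(margin: float) -> float:
    """Sigmoid DPO loss for one example: -log(sigmoid(margin)).

    Computed as softplus(-margin) in a numerically stable form.
    """
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))


# The reported margin should equal chosen minus rejected, up to rounding.
margin = rewards_chosen - rewards_rejected
margin_is_consistent = abs(margin - reported_margin) < 1e-3

# With a zero margin (policy equals reference) the loss is ln 2,
# which matches the first logged training loss of 0.6931.
loss_at_zero_margin = dpo_loss_from_margin(0.0)

# The loss evaluated at the mean margin is only a lower bound on the mean
# loss, because -log(sigmoid(x)) is convex (Jensen's inequality).
loss_at_mean_margin = dpo_loss_from_margin(reported_margin)

print(f"margin = {margin:.4f}, consistent = {margin_is_consistent}")
print(f"loss at zero margin = {loss_at_zero_margin:.4f}")
print(f"loss at mean margin = {loss_at_mean_margin:.3f} (reported mean loss {reported_eval_loss})")
```

The gap between the loss at the mean margin and the reported mean loss is expected given the wide margin spread (standard deviation 0.5824) and the roughly 29% of evaluation pairs that are still ranked incorrectly.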

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1577fedaea18596a5891baa023ddb2443d80c81b6aeea99139340cbf717ca4f4
+oid sha256:f166177a6f7019483154049fe98e36b1ebf2ef9046f1ceec90f963b907b13c5e
 size 671150064
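
The file above is a Git LFS pointer, not the weights themselves: three `key value` lines giving the spec version, the SHA-256 object ID, and the size in bytes. A minimal sketch of reading the two pointer versions follows; the function name `parse_lfs_pointer` is hypothetical and not part of any library.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its version, hash, and size."""
    fields = {}
    for line in text.strip().splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.partition(" ")
        fields[key] = value
    algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Old and new pointer contents, copied from the diff above.
old_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:1577fedaea18596a5891baa023ddb2443d80c81b6aeea99139340cbf717ca4f4\n"
    "size 671150064\n"
)
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f166177a6f7019483154049fe98e36b1ebf2ef9046f1ceec90f963b907b13c5e\n"
    "size 671150064\n"
)

weights_changed = old_pointer["digest"] != new_pointer["digest"]
same_size = old_pointer["size"] == new_pointer["size"]
print(f"weights changed: {weights_changed}, same size: {same_size}")
```

A new digest with an unchanged size is what one would expect when the adapter weights are updated while the tensor shapes and dtypes stay the same.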

all_results.json ADDED

@@ -0,0 +1,8 @@
+{
+    "epoch": 1.0,
+    "train_loss": 0.6106676421631342,
+    "train_runtime": 67977.0297,
+    "train_samples": 66812,
+    "train_samples_per_second": 0.983,
+    "train_steps_per_second": 0.061
+}
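
The throughput figures in this file follow from the sample count, the runtime, and the batch settings listed in the model card. The sketch below recomputes them. It assumes the number of optimizer steps is the sample count divided by the total batch size, rounded up, which may differ slightly from how the trainer counts a partial final batch.

```python
import math

# From all_results.json above.
train_samples = 66812
train_runtime = 67977.0297  # seconds

# From the hyperparameters in the model card.
per_device_train_batch_size = 4
num_devices = 2
gradient_accumulation_steps = 2

# Samples consumed per optimizer step across all devices.
total_train_batch_size = (
    per_device_train_batch_size * num_devices * gradient_accumulation_steps
)

# Assumption: one epoch takes ceil(samples / total batch size) optimizer steps.
optimizer_steps = math.ceil(train_samples / total_train_batch_size)

samples_per_second = round(train_samples / train_runtime, 3)
steps_per_second = round(optimizer_steps / train_runtime, 3)
runtime_hours = train_runtime / 3600

print(f"total_train_batch_size = {total_train_batch_size}")
print(f"optimizer_steps        = {optimizer_steps}")
print(f"samples_per_second     = {samples_per_second}")
print(f"steps_per_second       = {steps_per_second}")
print(f"runtime_hours          = {runtime_hours:.1f}")
```

Under these assumptions the recomputed values agree with the reported `total_train_batch_size` of 16, `train_samples_per_second` of 0.983, and `train_steps_per_second` of 0.061, and the last logged evaluation at step 4100 falls at about 0.98 of the epoch, as in the results table.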

runs/Jul16_03-11-51_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1721100016.notebook-deployment-48-7d9b6c99-p5kv4.69914.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:60a96f76d1d5aafc8512c7bc37d1cdc245ccc6db9af8bc6b79341b73bda7459f
-size 404929
+oid sha256:bbd428a1f0fbb200e7b7c976f68ab8488c77f6c2610b25c00c4c92a344a32d74
+size 411443

train_results.json ADDED

@@ -0,0 +1,8 @@
+{
+    "epoch": 1.0,
+    "train_loss": 0.6106676421631342,
+    "train_runtime": 67977.0297,
+    "train_samples": 66812,
+    "train_samples_per_second": 0.983,
+    "train_steps_per_second": 0.061
+}

trainer_state.json ADDED

The diff for this file is too large to render.