Model save

Files changed (6)

README.md +118 -0
adapter_model.safetensors +1 -1
all_results.json +8 -0
runs/Jul16_03-11-53_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1721100016.notebook-deployment-48-7d9b6c99-p5kv4.69986.0 +2 -2
train_results.json +8 -0
trainer_state.json +0 -0

README.md ADDED

@@ -0,0 +1,118 @@

+---
+base_model: alignment-handbook/zephyr-7b-sft-full
+library_name: peft
+license: apache-2.0
+tags:
+- trl
+- dpo
+- generated_from_trainer
+model-index:
+- name: zephyr-dpo-qlora-uf-ours-uffull-5e-6
+  results: []
+---
+<!-- This model card has been generated automatically according to the information the Trainer had access to. You
+should probably proofread and complete it, then remove this comment. -->
+# zephyr-dpo-qlora-uf-ours-uffull-5e-6
+This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
+It achieves the following results on the evaluation set:
+- Loss: 0.4950
+- Rewards/chosen: -1.7859
+- Rewards/rejected: -2.8799
+- Rewards/accuracies: 0.7480
+- Rewards/margins: 1.0940
+- Rewards/margins Max: 3.5863
+- Rewards/margins Min: -0.9706
+- Rewards/margins Std: 1.5436
+- Logps/rejected: -553.8444
+- Logps/chosen: -463.0418
+- Logits/rejected: -1.5527
+- Logits/chosen: -1.6196
+## Model description
+More information needed
+## Intended uses & limitations
+More information needed
+## Training and evaluation data
+More information needed
+## Training procedure
+### Training hyperparameters
+The following hyperparameters were used during training:
+- learning_rate: 5e-06
+- train_batch_size: 4
+- eval_batch_size: 8
+- seed: 42
+- distributed_type: multi-GPU
+- num_devices: 2
+- gradient_accumulation_steps: 2
+- total_train_batch_size: 16
+- total_eval_batch_size: 16
+- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
+- lr_scheduler_type: cosine
+- lr_scheduler_warmup_ratio: 0.1
+- num_epochs: 1
+### Training results
+| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
+|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:-------------------:|:-------------------:|:-------------------:|:--------------:|:------------:|:---------------:|:-------------:|
+| 0.6903        | 0.02  | 100  | 0.6905          | 0.0096         | 0.0042           | 0.6635             | 0.0055          | 0.0279              | -0.0135             | 0.0138              | -265.4348      | -283.4918    | -2.7667         | -2.8015       |
+| 0.6668        | 0.05  | 200  | 0.6714          | 0.0249         | -0.0232          | 0.6645             | 0.0481          | 0.2299              | -0.1105             | 0.1130              | -268.1768      | -281.9665    | -2.7343         | -2.7676       |
+| 0.6136        | 0.07  | 300  | 0.6388          | -0.2723        | -0.4201          | 0.6695             | 0.1478          | 0.6956              | -0.3145             | 0.3388              | -307.8617      | -311.6826    | -2.6777         | -2.7086       |
+| 0.6224        | 0.1   | 400  | 0.6072          | -0.4408        | -0.7266          | 0.6825             | 0.2858          | 1.2193              | -0.5526             | 0.5951              | -338.5125      | -328.5356    | -2.5218         | -2.5541       |
+| 0.5913        | 0.12  | 500  | 0.5700          | -0.6299        | -1.0928          | 0.6975             | 0.4629          | 1.7719              | -0.6554             | 0.8141              | -375.1356      | -347.4472    | -2.1793         | -2.2226       |
+| 0.5721        | 0.14  | 600  | 0.5595          | -1.1081        | -1.7353          | 0.7145             | 0.6271          | 2.2934              | -0.8628             | 1.0597              | -439.3786      | -395.2698    | -2.0549         | -2.1036       |
+| 0.4888        | 0.17  | 700  | 0.5546          | -1.4460        | -2.1425          | 0.7085             | 0.6965          | 2.5873              | -0.9396             | 1.1811              | -480.1024      | -429.0589    | -1.7782         | -1.8362       |
+| 0.4774        | 0.19  | 800  | 0.5258          | -1.2110        | -1.9801          | 0.7270             | 0.7691          | 2.5889              | -0.8329             | 1.1591              | -463.8646      | -405.5573    | -1.9074         | -1.9645       |
+| 0.5210        | 0.22  | 900  | 0.5286          | -1.4043        | -2.2106          | 0.7355             | 0.8063          | 2.8030              | -0.8890             | 1.2406              | -486.9130      | -424.8805    | -1.5390         | -1.5999       |
+| 0.4871        | 0.24  | 1000 | 0.5354          | -1.0617        | -1.8924          | 0.7250             | 0.8307          | 2.9996              | -0.8983             | 1.3137              | -455.0902      | -390.6243    | -1.7795         | -1.8273       |
+| 0.5574        | 0.26  | 1100 | 0.5379          | -1.2560        | -2.0556          | 0.7205             | 0.7996          | 3.0463              | -0.8879             | 1.3085              | -471.4182      | -410.0581    | -1.6403         | -1.6951       |
+| 0.5017        | 0.29  | 1200 | 0.5261          | -1.3320        | -2.1724          | 0.7295             | 0.8404          | 2.9985              | -0.8951             | 1.3031              | -483.0894      | -417.6535    | -1.7025         | -1.7570       |
+| 0.4478        | 0.31  | 1300 | 0.5277          | -1.7254        | -2.6499          | 0.7230             | 0.9245          | 3.2834              | -1.0237             | 1.4394              | -530.8426      | -456.9910    | -1.7244         | -1.7779       |
+| 0.4919        | 0.34  | 1400 | 0.5189          | -1.1742        | -2.0426          | 0.7365             | 0.8684          | 3.0337              | -0.9052             | 1.3302              | -470.1158      | -401.8751    | -1.5533         | -1.6223       |
+| 0.4792        | 0.36  | 1500 | 0.5205          | -1.3947        | -2.3310          | 0.7340             | 0.9364          | 3.1265              | -0.9863             | 1.3913              | -498.9553      | -423.9220    | -1.6972         | -1.7596       |
+| 0.4952        | 0.38  | 1600 | 0.5316          | -1.8397        | -2.8176          | 0.7290             | 0.9779          | 3.2675              | -1.0997             | 1.4769              | -547.6121      | -468.4282    | -1.8293         | -1.8827       |
+| 0.5084        | 0.41  | 1700 | 0.5285          | -2.4336        | -3.4484          | 0.7295             | 1.0147          | 3.4046              | -1.1112             | 1.5199              | -610.6892      | -527.8181    | -1.5473         | -1.6112       |
+| 0.4676        | 0.43  | 1800 | 0.5162          | -1.8360        | -2.7043          | 0.7370             | 0.8683          | 2.8969              | -0.9280             | 1.2953              | -536.2840      | -468.0518    | -1.5045         | -1.5680       |
+| 0.4588        | 0.45  | 1900 | 0.5073          | -1.5345        | -2.4614          | 0.7435             | 0.9269          | 3.0227              | -0.9141             | 1.3341              | -511.9908      | -437.9078    | -1.3109         | -1.3855       |
+| 0.4826        | 0.48  | 2000 | 0.5104          | -1.6277        | -2.6050          | 0.7385             | 0.9773          | 3.2595              | -0.9829             | 1.4282              | -526.3553      | -447.2241    | -1.3208         | -1.3956       |
+| 0.4925        | 0.5   | 2100 | 0.5079          | -1.6078        | -2.5256          | 0.7355             | 0.9178          | 2.9879              | -0.9518             | 1.3324              | -518.4150      | -445.2356    | -1.5277         | -1.5931       |
+| 0.5460        | 0.53  | 2200 | 0.5100          | -1.7097        | -2.6882          | 0.7370             | 0.9785          | 3.1492              | -1.0011             | 1.4117              | -534.6687      | -455.4216    | -1.4247         | -1.4938       |
+| 0.4958        | 0.55  | 2300 | 0.5047          | -1.4824        | -2.3935          | 0.7385             | 0.9111          | 2.9984              | -0.8454             | 1.2951              | -505.2043      | -432.6925    | -1.6758         | -1.7328       |
+| 0.4757        | 0.57  | 2400 | 0.5021          | -1.6699        | -2.6304          | 0.7380             | 0.9605          | 3.1590              | -0.8924             | 1.3656              | -528.8900      | -451.4436    | -1.4670         | -1.5347       |
+| 0.4539        | 0.6   | 2500 | 0.5025          | -1.7424        | -2.7890          | 0.7400             | 1.0466          | 3.4316              | -1.0034             | 1.5001              | -544.7556      | -458.6970    | -1.5551         | -1.6231       |
+| 0.4612        | 0.62  | 2600 | 0.4991          | -1.7503        | -2.8124          | 0.7415             | 1.0621          | 3.4721              | -0.9695             | 1.5041              | -547.0907      | -459.4844    | -1.4927         | -1.5622       |
+| 0.5267        | 0.65  | 2700 | 0.4989          | -1.5988        | -2.5869          | 0.7410             | 0.9881          | 3.2210              | -0.9401             | 1.4114              | -524.5454      | -444.3344    | -1.5476         | -1.6161       |
+| 0.4999        | 0.67  | 2800 | 0.4974          | -1.6001        | -2.5954          | 0.7470             | 0.9953          | 3.2272              | -0.8964             | 1.3973              | -525.3958      | -444.4690    | -1.5260         | -1.5935       |
+| 0.4589        | 0.69  | 2900 | 0.4977          | -1.7829        | -2.8625          | 0.7415             | 1.0796          | 3.5812              | -0.9488             | 1.5304              | -552.1008      | -462.7464    | -1.5484         | -1.6154       |
+| 0.4433        | 0.72  | 3000 | 0.4995          | -1.7820        | -2.8827          | 0.7395             | 1.1007          | 3.6468              | -0.9945             | 1.5727              | -554.1236      | -462.6560    | -1.5922         | -1.6589       |
+| 0.4908        | 0.74  | 3100 | 0.4970          | -1.7323        | -2.7993          | 0.7415             | 1.0669          | 3.5268              | -0.9553             | 1.5148              | -545.7810      | -457.6894    | -1.6165         | -1.6807       |
+| 0.4325        | 0.77  | 3200 | 0.4972          | -1.3958        | -2.4076          | 0.7500             | 1.0117          | 3.3475              | -0.9045             | 1.4383              | -506.6104      | -424.0385    | -1.6999         | -1.7600       |
+| 0.4645        | 0.79  | 3300 | 0.4970          | -1.7218        | -2.8037          | 0.7485             | 1.0819          | 3.5295              | -0.9807             | 1.5290              | -546.2211      | -456.6324    | -1.5845         | -1.6505       |
+| 0.4612        | 0.81  | 3400 | 0.4980          | -1.8787        | -2.9919          | 0.7445             | 1.1132          | 3.6640              | -1.0013             | 1.5776              | -565.0459      | -472.3241    | -1.4980         | -1.5678       |
+| 0.4023        | 0.84  | 3500 | 0.4987          | -2.0641        | -3.1949          | 0.7410             | 1.1308          | 3.7331              | -1.0134             | 1.6034              | -585.3400      | -490.8608    | -1.4923         | -1.5625       |
+| 0.4564        | 0.86  | 3600 | 0.4952          | -1.8890        | -2.9834          | 0.7445             | 1.0943          | 3.5913              | -0.9690             | 1.5435              | -564.1885      | -473.3587    | -1.5268         | -1.5955       |
+| 0.4337        | 0.89  | 3700 | 0.4948          | -1.7899        | -2.8791          | 0.7480             | 1.0892          | 3.5650              | -0.9671             | 1.5348              | -553.7646      | -463.4457    | -1.5501         | -1.6174       |
+| 0.4687        | 0.91  | 3800 | 0.4949          | -1.7971        | -2.8908          | 0.7475             | 1.0937          | 3.5845              | -0.9702             | 1.5427              | -554.9319      | -464.1627    | -1.5573         | -1.6238       |
+| 0.4624        | 0.93  | 3900 | 0.4946          | -1.7588        | -2.8495          | 0.7480             | 1.0908          | 3.5789              | -0.9633             | 1.5386              | -550.8040      | -460.3306    | -1.5625         | -1.6288       |
+| 0.4744        | 0.96  | 4000 | 0.4948          | -1.7812        | -2.8753          | 0.7470             | 1.0941          | 3.5851              | -0.9685             | 1.5428              | -553.3815      | -462.5721    | -1.5573         | -1.6239       |
+| 0.4294        | 0.98  | 4100 | 0.4950          | -1.7859        | -2.8799          | 0.7480             | 1.0940          | 3.5863              | -0.9706             | 1.5436              | -553.8444      | -463.0418    | -1.5527         | -1.6196       |
+### Framework versions
+- PEFT 0.7.1
+- Transformers 4.39.0.dev0
+- Pytorch 2.1.2+cu121
+- Datasets 2.14.6
+- Tokenizers 0.15.2
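
Review note: two figures in the model card can be cross-checked by hand. In DPO the reported reward margin should equal the chosen reward minus the rejected reward (up to rounding of the logged means), and the total train batch size should equal the per-device batch size times the number of devices times the gradient accumulation steps. The sketch below recomputes both from the reported values; the variable names are illustrative and are not Trainer output.

```python
# Values copied from the final evaluation row and the hyperparameter list.
rewards_chosen = -1.7859
rewards_rejected = -2.8799
reported_margin = 1.0940

train_batch_size = 4              # per device
num_devices = 2
gradient_accumulation_steps = 2
reported_total_train_batch_size = 16

# Reward margin: chosen reward minus rejected reward.
margin = rewards_chosen - rewards_rejected

# Effective batch size per optimizer step.
total_train_batch_size = (
    train_batch_size * num_devices * gradient_accumulation_steps
)

print(f"margin: {margin:.4f} (reported {reported_margin:.4f})")
print(f"total_train_batch_size: {total_train_batch_size}")
```

Because each logged column is rounded to four decimals independently, some intermediate rows can differ from this identity in the last digit.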

adapter_model.safetensors CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1992a83470b48bff2c8a813a26eb35fd3afa6f351f65417a6aef66dd23a51fee
+oid sha256:6fd30e2b05960084925d57c9fa3171701519489ba3824ef4c00f18195828f52a
 size 671150064
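
Review note: `adapter_model.safetensors` is tracked with Git LFS, so this diff only touches the pointer file, not the weights themselves. A pointer is a short text file of `key value` lines (`version`, `oid`, `size`). The following is a minimal sketch of reading one; `parse_lfs_pointer` is a hypothetical helper written for illustration, not part of any Git LFS tooling.

```python
def parse_lfs_pointer(text: str) -> dict:
    """Parse a Git LFS pointer file into its fields (illustrative helper)."""
    fields = {}
    for line in text.strip().splitlines():
        # Each line is "<key> <value>", split on the first space only.
        key, _, value = line.partition(" ")
        fields[key] = value
    # The oid has the form "<hash-algorithm>:<hex digest>".
    hash_algo, _, digest = fields["oid"].partition(":")
    return {
        "version": fields["version"],
        "hash_algo": hash_algo,
        "digest": digest,
        "size": int(fields["size"]),  # size of the real file in bytes
    }


# New pointer contents from this commit.
pointer = """version https://git-lfs.github.com/spec/v1
oid sha256:6fd30e2b05960084925d57c9fa3171701519489ba3824ef4c00f18195828f52a
size 671150064
"""

info = parse_lfs_pointer(pointer)
print(info["hash_algo"], info["size"])
```

Here the size is unchanged at 671150064 bytes and only the SHA-256 digest differs, which is what one would expect when the adapter is retrained with the same shape.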

all_results.json ADDED

@@ -0,0 +1,8 @@

+{
+    "epoch": 1.0,
+    "train_loss": 0.5053737083043175,
+    "train_runtime": 67897.0602,
+    "train_samples": 66812,
+    "train_samples_per_second": 0.984,
+    "train_steps_per_second": 0.062
+}
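
Review note: the throughput fields in this file can be reproduced from the runtime and sample count. The sketch below does so under one assumption that is not stated in the JSON: that the number of optimizer steps is the sample count divided by the total train batch size of 16 from the model card, rounded up.

```python
import math

# Values copied from all_results.json.
train_runtime = 67897.0602    # seconds
train_samples = 66812

# Assumption: taken from the model card, not from this JSON file.
total_train_batch_size = 16

samples_per_second = train_samples / train_runtime

# Assumption: one epoch, with the last partial batch counted as a step.
optimizer_steps = math.ceil(train_samples / total_train_batch_size)
steps_per_second = optimizer_steps / train_runtime

runtime_hours = train_runtime / 3600

print(f"samples/s: {samples_per_second:.3f}")
print(f"steps/s:   {steps_per_second:.3f}")
print(f"runtime:   {runtime_hours:.1f} h")
```

Under that assumption the results round to the reported 0.984 samples per second and 0.062 steps per second, and the run lasted a little under 19 hours.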

runs/Jul16_03-11-53_notebook-deployment-48-7d9b6c99-p5kv4/events.out.tfevents.1721100016.notebook-deployment-48-7d9b6c99-p5kv4.69986.0 CHANGED

@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:d8b00ece567f8d04796e344bddd33a8f31795efa1c268e3b52b4057390930e70
-size 404929
+oid sha256:ab181843a9ddb46178c7aac5f9ea9a21b6ea7ff61ed657b2c0e969db79605860
+size 411443

train_results.json ADDED

@@ -0,0 +1,8 @@

+{
+    "epoch": 1.0,
+    "train_loss": 0.5053737083043175,
+    "train_runtime": 67897.0602,
+    "train_samples": 66812,
+    "train_samples_per_second": 0.984,
+    "train_steps_per_second": 0.062
+}

trainer_state.json ADDED

The diff for this file is too large to render; see the raw diff.