Jan Majkutewicz committed on
Commit d601388
1 Parent(s): 08b1d78

Model save
README.md ADDED
@@ -0,0 +1,109 @@
---
license: apache-2.0
library_name: peft
tags:
- trl
- dpo
- generated_from_trainer
base_model: alignment-handbook/zephyr-7b-sft-full
model-index:
- name: zephyr-7b-dpo-lora
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# zephyr-7b-dpo-lora

This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on an unspecified dataset.
It achieves the following results on the evaluation set (a note on how these rewards are defined follows the list):
- Loss: 0.5893
- Rewards/chosen: -0.2740
- Rewards/rejected: -0.6023
- Rewards/accuracies: 0.7025
- Rewards/margins: 0.3283
- Logps/rejected: -321.6666
- Logps/chosen: -310.1333
- Logits/rejected: -2.7525
- Logits/chosen: -2.7742
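For context beyond the auto-generated card: in DPO the reward model is implicit, derived from the log-probability ratio between the trained policy and the frozen reference (here, the SFT base model), so the Rewards/* metrics above follow the standard DPO formulation:

$$
r_\theta(x, y) = \beta \log \frac{\pi_\theta(y \mid x)}{\pi_{\mathrm{ref}}(y \mid x)},
\qquad
\mathcal{L}_{\mathrm{DPO}} = -\log \sigma\bigl(r_\theta(x, y_w) - r_\theta(x, y_l)\bigr)
$$

where $y_w$ and $y_l$ are the chosen and rejected completions and $\beta$ is the DPO temperature (its value is not recorded in this card). Rewards/margins is the mean of $r_\theta(x, y_w) - r_\theta(x, y_l)$ over evaluation pairs, and Rewards/accuracies is the fraction of pairs in which the chosen completion receives the higher implicit reward.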
## Model description

More information needed

## Intended uses & limitations

More information needed
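Since this section is still a placeholder, here is a minimal, hypothetical inference sketch. It assumes only what the card's metadata states: the base model is `alignment-handbook/zephyr-7b-sft-full` and this repository provides a PEFT LoRA adapter. The adapter path and generation settings are placeholders.

```python
# Minimal sketch: attach this LoRA adapter to its SFT base model for inference.
# `adapter_path` is a placeholder for this repository's local path or Hub id.
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "alignment-handbook/zephyr-7b-sft-full"  # from the card's base_model field
adapter_path = "path/to/zephyr-7b-dpo-lora"        # placeholder

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter_path)  # load the DPO-trained LoRA weights
model.eval()

prompt = "Explain what direct preference optimization does."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```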
## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training (a configuration sketch follows the list):
- learning_rate: 5e-07
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- gradient_accumulation_steps: 2
- total_train_batch_size: 16
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
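As a rough reconstruction, the hyperparameters above might be wired into TRL's `DPOTrainer` as below. The LoRA shape, the DPO `beta`, and the dataset are assumptions the card does not record, and TRL argument names vary across releases; treat this as a sketch, not the actual training script. The listed Adam betas and epsilon match the Trainer defaults, so no explicit optimizer setup is needed.

```python
# Sketch: map the card's hyperparameters onto TRL's DPOTrainer with a PEFT LoRA
# config. LoRA shape, beta, and the dataset are assumptions (not in the card).
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

base_id = "alignment-handbook/zephyr-7b-sft-full"
model = AutoModelForCausalLM.from_pretrained(base_id)
tokenizer = AutoTokenizer.from_pretrained(base_id)

peft_config = LoraConfig(  # assumed shape; the card omits the LoRA settings
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

args = DPOConfig(  # the values below come from the card's hyperparameter list
    output_dir="zephyr-7b-dpo-lora",
    learning_rate=5e-7,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    gradient_accumulation_steps=2,  # 8 * 2 = total_train_batch_size of 16
    num_train_epochs=1,
    lr_scheduler_type="cosine",
    warmup_ratio=0.1,
    seed=42,
    beta=0.1,  # assumed; the card does not record the DPO beta
)

# A preference dataset with "prompt"/"chosen"/"rejected" columns is required;
# the card does not say which one was used, so this path is a placeholder.
train_ds = load_dataset("path/to/preference_dataset", split="train")

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    processing_class=tokenizer,  # `tokenizer=` in older TRL releases
    peft_config=peft_config,
)
trainer.train()
```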
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:------:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6929 | 0.0262 | 100 | 0.6930 | -0.0001 | -0.0004 | 0.5250 | 0.0003 | -261.4788 | -282.7496 | -2.8388 | -2.8661 |
| 0.6923 | 0.0523 | 200 | 0.6923 | 0.0008 | -0.0009 | 0.6050 | 0.0017 | -261.5316 | -282.6624 | -2.8380 | -2.8653 |
| 0.6898 | 0.0785 | 300 | 0.6903 | 0.0035 | -0.0024 | 0.6640 | 0.0058 | -261.6760 | -282.3918 | -2.8350 | -2.8623 |
| 0.6872 | 0.1047 | 400 | 0.6862 | 0.0165 | 0.0021 | 0.6670 | 0.0144 | -261.2256 | -281.0900 | -2.8308 | -2.8577 |
| 0.6783 | 0.1309 | 500 | 0.6804 | 0.0209 | -0.0059 | 0.6835 | 0.0267 | -262.0230 | -280.6481 | -2.8215 | -2.8486 |
| 0.6729 | 0.1570 | 600 | 0.6733 | 0.0154 | -0.0272 | 0.6840 | 0.0426 | -264.1608 | -281.1958 | -2.8138 | -2.8410 |
| 0.6665 | 0.1832 | 700 | 0.6638 | -0.0035 | -0.0689 | 0.6755 | 0.0654 | -268.3266 | -283.0863 | -2.8060 | -2.8327 |
| 0.6427 | 0.2094 | 800 | 0.6546 | -0.0214 | -0.1104 | 0.6815 | 0.0889 | -272.4747 | -284.8825 | -2.8020 | -2.8283 |
| 0.6428 | 0.2355 | 900 | 0.6458 | -0.0247 | -0.1383 | 0.6770 | 0.1136 | -275.2685 | -285.2050 | -2.7942 | -2.8199 |
| 0.6381 | 0.2617 | 1000 | 0.6358 | -0.0638 | -0.2074 | 0.6785 | 0.1436 | -282.1761 | -289.1206 | -2.7887 | -2.8138 |
| 0.6488 | 0.2879 | 1100 | 0.6284 | -0.1378 | -0.3055 | 0.6790 | 0.1677 | -291.9890 | -296.5138 | -2.7826 | -2.8071 |
| 0.6427 | 0.3141 | 1200 | 0.6223 | -0.1104 | -0.2986 | 0.6835 | 0.1882 | -291.3028 | -293.7785 | -2.7931 | -2.8165 |
| 0.6131 | 0.3402 | 1300 | 0.6172 | -0.1466 | -0.3514 | 0.6865 | 0.2049 | -296.5806 | -297.3945 | -2.7951 | -2.8180 |
| 0.6326 | 0.3664 | 1400 | 0.6155 | -0.1752 | -0.3896 | 0.6860 | 0.2144 | -300.3966 | -300.2597 | -2.7920 | -2.8147 |
| 0.6128 | 0.3926 | 1500 | 0.6180 | -0.0630 | -0.2687 | 0.6890 | 0.2057 | -288.3090 | -289.0369 | -2.7980 | -2.8198 |
| 0.6223 | 0.4187 | 1600 | 0.6088 | -0.1688 | -0.4097 | 0.6945 | 0.2409 | -302.4074 | -299.6220 | -2.7926 | -2.8148 |
| 0.6338 | 0.4449 | 1700 | 0.6061 | -0.2152 | -0.4665 | 0.6925 | 0.2513 | -308.0869 | -304.2535 | -2.7961 | -2.8181 |
| 0.585 | 0.4711 | 1800 | 0.6050 | -0.1327 | -0.3850 | 0.6915 | 0.2523 | -299.9368 | -296.0054 | -2.7949 | -2.8174 |
| 0.577 | 0.4973 | 1900 | 0.6013 | -0.2170 | -0.4883 | 0.6965 | 0.2713 | -310.2670 | -304.4333 | -2.7954 | -2.8176 |
| 0.5945 | 0.5234 | 2000 | 0.5992 | -0.2107 | -0.4899 | 0.6995 | 0.2793 | -310.4293 | -303.8028 | -2.7903 | -2.8122 |
| 0.5913 | 0.5496 | 2100 | 0.5981 | -0.2373 | -0.5251 | 0.7025 | 0.2879 | -313.9529 | -306.4641 | -2.7863 | -2.8085 |
| 0.5816 | 0.5758 | 2200 | 0.5989 | -0.2688 | -0.5570 | 0.6970 | 0.2883 | -317.1411 | -309.6146 | -2.7849 | -2.8070 |
| 0.5824 | 0.6019 | 2300 | 0.5961 | -0.2227 | -0.5189 | 0.6955 | 0.2961 | -313.3233 | -305.0098 | -2.7821 | -2.8037 |
| 0.602 | 0.6281 | 2400 | 0.5969 | -0.2683 | -0.5669 | 0.6990 | 0.2986 | -318.1251 | -309.5652 | -2.7744 | -2.7961 |
| 0.5792 | 0.6543 | 2500 | 0.5963 | -0.2102 | -0.5041 | 0.6975 | 0.2938 | -311.8429 | -303.7615 | -2.7763 | -2.7980 |
| 0.6028 | 0.6805 | 2600 | 0.5974 | -0.1896 | -0.4790 | 0.6920 | 0.2895 | -309.3417 | -301.6964 | -2.7717 | -2.7933 |
| 0.5854 | 0.7066 | 2700 | 0.5930 | -0.2517 | -0.5615 | 0.7020 | 0.3098 | -317.5864 | -307.9027 | -2.7676 | -2.7892 |
| 0.5994 | 0.7328 | 2800 | 0.5920 | -0.2607 | -0.5775 | 0.7045 | 0.3167 | -319.1838 | -308.8107 | -2.7636 | -2.7851 |
| 0.5837 | 0.7590 | 2900 | 0.5913 | -0.2540 | -0.5721 | 0.7055 | 0.3181 | -318.6511 | -308.1379 | -2.7619 | -2.7834 |
| 0.5858 | 0.7851 | 3000 | 0.5910 | -0.2625 | -0.5835 | 0.7055 | 0.3210 | -319.7853 | -308.9898 | -2.7605 | -2.7819 |
| 0.5685 | 0.8113 | 3100 | 0.5914 | -0.2383 | -0.5571 | 0.7040 | 0.3188 | -317.1507 | -306.5707 | -2.7558 | -2.7777 |
| 0.5753 | 0.8375 | 3200 | 0.5903 | -0.2623 | -0.5868 | 0.7020 | 0.3246 | -320.1224 | -308.9666 | -2.7567 | -2.7783 |
| 0.5769 | 0.8636 | 3300 | 0.5900 | -0.2673 | -0.5934 | 0.7030 | 0.3260 | -320.7757 | -309.4716 | -2.7555 | -2.7771 |
| 0.5608 | 0.8898 | 3400 | 0.5896 | -0.2716 | -0.5988 | 0.7020 | 0.3273 | -321.3196 | -309.8930 | -2.7520 | -2.7739 |
| 0.6008 | 0.9160 | 3500 | 0.5895 | -0.2716 | -0.5994 | 0.7035 | 0.3277 | -321.3745 | -309.9000 | -2.7539 | -2.7755 |
| 0.585 | 0.9422 | 3600 | 0.5895 | -0.2722 | -0.6000 | 0.7020 | 0.3279 | -321.4418 | -309.9531 | -2.7549 | -2.7764 |
| 0.567 | 0.9683 | 3700 | 0.5893 | -0.2738 | -0.6022 | 0.7015 | 0.3284 | -321.6555 | -310.1171 | -2.7539 | -2.7755 |
| 0.5834 | 0.9945 | 3800 | 0.5893 | -0.2740 | -0.6023 | 0.7025 | 0.3283 | -321.6666 | -310.1333 | -2.7525 | -2.7742 |


### Framework versions

- PEFT 0.10.0
- Transformers 4.40.0
- Pytorch 2.2.0
- Datasets 2.16.1
- Tokenizers 0.19.1
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:1818b92cbd208058d804d1f94c779b3a7f08d2aade39d0e96b4524b7d518431a
+oid sha256:7824785b77388bdacbd438b6940d2e36888c73f044b90f65f3e52ea1d3c98100
 size 1342238560
all_results.json ADDED
@@ -0,0 +1,9 @@
{
  "epoch": 1.0,
  "total_flos": 0.0,
  "train_loss": 0.6164219083351729,
  "train_runtime": 73481.1174,
  "train_samples": 61134,
  "train_samples_per_second": 0.832,
  "train_steps_per_second": 0.052
}
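As a quick consistency check on these figures: 61,134 samples processed in 73,481.1 s gives 61134 / 73481.1 ≈ 0.832 samples per second, and dividing by the effective batch size of 16 gives 0.832 / 16 ≈ 0.052 optimizer steps per second, matching the reported values.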
train_results.json ADDED
@@ -0,0 +1,9 @@
{
  "epoch": 1.0,
  "total_flos": 0.0,
  "train_loss": 0.6164219083351729,
  "train_runtime": 73481.1174,
  "train_samples": 61134,
  "train_samples_per_second": 0.832,
  "train_steps_per_second": 0.052
}
trainer_state.json ADDED
The diff for this file is too large to render.