Model save

Files changed:
- README.md (+114 −0)
- adapter_model.safetensors (+1 −1)
- all_results.json (+8 −0)
- runs/Jul22_23-14-44_node26/events.out.tfevents.1721657973.node26.3249241.0 (+2 −2)
- train_results.json (+8 −0)
- trainer_state.json (+0 −0)

README.md (ADDED)
---
license: apache-2.0
library_name: peft
tags:
- trl
- dpo
- generated_from_trainer
base_model: alignment-handbook/zephyr-7b-sft-full
model-index:
- name: zephyr-7b-dpo-uffull-qlora-5e-7
  results: []
---
13 |
+
|
14 |
+
<!-- This model card has been generated automatically according to the information the Trainer had access to. You
|
15 |
+
should probably proofread and complete it, then remove this comment. -->
|
16 |
+
|
17 |
+
# zephyr-7b-dpo-uffull-qlora-5e-7
|
18 |
+
|
19 |
+
This model is a fine-tuned version of [alignment-handbook/zephyr-7b-sft-full](https://huggingface.co/alignment-handbook/zephyr-7b-sft-full) on the None dataset.
|
20 |
+
It achieves the following results on the evaluation set:
|
21 |
+
- Loss: 0.5925
|
22 |
+
- Rewards/chosen: -0.2517
|
23 |
+
- Rewards/rejected: -0.6019
|
24 |
+
- Rewards/accuracies: 0.7341
|
25 |
+
- Rewards/margins: 0.3502
|
26 |
+
- Rewards/margins Max: 1.2316
|
27 |
+
- Rewards/margins Min: -0.5546
|
28 |
+
- Rewards/margins Std: 0.6038
|
29 |
+
- Logps/rejected: -322.3433
|
30 |
+
- Logps/chosen: -309.6684
|
31 |
+
- Logits/rejected: -2.6801
|
32 |
+
- Logits/chosen: -2.7126
|
33 |
+
|
34 |
+
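The reward metrics above are DPO's *implicit* rewards: β times the gap between the policy's and the reference model's sequence log-probability. A minimal sketch of how they relate (the reference-model log-probs below are hypothetical values chosen purely for illustration, and β = 0.1 is a common TRL default, not a value confirmed by this card):

```python
def implicit_reward(policy_logp, ref_logp, beta=0.1):
    """DPO implicit reward: beta * (log pi(y|x) - log pi_ref(y|x))."""
    return beta * (policy_logp - ref_logp)

# Logps/chosen and Logps/rejected come from the eval results above; the
# reference log-probs here are hypothetical, picked only for illustration.
chosen = implicit_reward(policy_logp=-309.67, ref_logp=-307.15)
rejected = implicit_reward(policy_logp=-322.34, ref_logp=-316.32)
margin = chosen - rejected  # Rewards/margins is the mean of this quantity
```

Rewards/accuracies is then the fraction of preference pairs for which the chosen completion's implicit reward exceeds the rejected one's.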
## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-07
- train_batch_size: 4
- eval_batch_size: 8
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 16
- total_eval_batch_size: 32
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: cosine
- lr_scheduler_warmup_ratio: 0.1
- num_epochs: 1
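The cosine schedule with 10% linear warmup implied by these hyperparameters can be sketched as follows. This is a simplified reimplementation of the multiplier `transformers` applies, not the library code itself; the total step count is an assumption derived from 61,134 training samples at the effective batch size of 16 (4 per device × 4 GPUs):

```python
import math

def lr_at_step(step, total_steps, base_lr=5e-7, warmup_ratio=0.1):
    """Cosine decay with linear warmup (a sketch of the schedule used)."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup from 0 to base_lr over the first 10% of steps.
        return base_lr * step / max(1, warmup_steps)
    # Cosine decay from base_lr down to ~0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

total = 3821  # assumption: ceil(61134 / 16) optimizer steps for one epoch
peak = lr_at_step(382, total)     # end of warmup -> peak learning rate
final = lr_at_step(total, total)  # end of training -> decayed to ~0
```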
### Training results

| Training Loss | Epoch | Step | Validation Loss | Rewards/chosen | Rewards/rejected | Rewards/accuracies | Rewards/margins | Rewards/margins Max | Rewards/margins Min | Rewards/margins Std | Logps/rejected | Logps/chosen | Logits/rejected | Logits/chosen |
|:-------------:|:-----:|:----:|:---------------:|:--------------:|:----------------:|:------------------:|:---------------:|:-------------------:|:-------------------:|:-------------------:|:--------------:|:------------:|:---------------:|:-------------:|
| 0.6929 | 0.03 | 100 | 0.6930 | 0.0001 | -0.0003 | 0.5377 | 0.0004 | 0.0054 | -0.0041 | 0.0032 | -262.1841 | -284.4886 | -2.7819 | -2.8200 |
| 0.6922 | 0.05 | 200 | 0.6923 | 0.0008 | -0.0010 | 0.6627 | 0.0019 | 0.0100 | -0.0058 | 0.0051 | -262.2543 | -284.4120 | -2.7814 | -2.8195 |
| 0.6908 | 0.08 | 300 | 0.6903 | 0.0041 | -0.0025 | 0.7143 | 0.0066 | 0.0281 | -0.0141 | 0.0137 | -262.3995 | -284.0884 | -2.7806 | -2.8185 |
| 0.689 | 0.1 | 400 | 0.6870 | 0.0093 | -0.0046 | 0.7183 | 0.0140 | 0.0586 | -0.0282 | 0.0285 | -262.6125 | -283.5621 | -2.7783 | -2.8162 |
| 0.6813 | 0.13 | 500 | 0.6813 | 0.0235 | -0.0040 | 0.7242 | 0.0275 | 0.1137 | -0.0534 | 0.0551 | -262.5450 | -282.1426 | -2.7758 | -2.8132 |
| 0.6712 | 0.16 | 600 | 0.6742 | 0.0200 | -0.0247 | 0.7262 | 0.0447 | 0.1814 | -0.0859 | 0.0884 | -264.6151 | -282.4901 | -2.7638 | -2.8015 |
| 0.6643 | 0.18 | 700 | 0.6653 | 0.0004 | -0.0668 | 0.7242 | 0.0672 | 0.2707 | -0.1305 | 0.1329 | -268.8295 | -284.4591 | -2.7558 | -2.7925 |
| 0.6421 | 0.21 | 800 | 0.6562 | -0.0231 | -0.1154 | 0.7222 | 0.0923 | 0.3706 | -0.1761 | 0.1820 | -273.6847 | -286.8017 | -2.7519 | -2.7880 |
| 0.648 | 0.24 | 900 | 0.6480 | -0.0748 | -0.1938 | 0.7183 | 0.1190 | 0.4823 | -0.2242 | 0.2359 | -281.5314 | -291.9791 | -2.7477 | -2.7835 |
| 0.6547 | 0.26 | 1000 | 0.6378 | -0.0763 | -0.2278 | 0.7183 | 0.1515 | 0.5995 | -0.2816 | 0.2954 | -284.9341 | -292.1262 | -2.7446 | -2.7798 |
| 0.6408 | 0.29 | 1100 | 0.6317 | -0.0432 | -0.2136 | 0.7262 | 0.1704 | 0.6414 | -0.2953 | 0.3163 | -283.5132 | -288.8173 | -2.7545 | -2.7885 |
| 0.6358 | 0.31 | 1200 | 0.6260 | -0.0529 | -0.2480 | 0.7183 | 0.1952 | 0.7219 | -0.3249 | 0.3520 | -286.9514 | -289.7809 | -2.7585 | -2.7914 |
| 0.6297 | 0.34 | 1300 | 0.6215 | -0.1213 | -0.3378 | 0.7143 | 0.2165 | 0.8114 | -0.3727 | 0.4028 | -295.9312 | -296.6275 | -2.7489 | -2.7816 |
| 0.6165 | 0.37 | 1400 | 0.6213 | -0.2177 | -0.4420 | 0.7103 | 0.2243 | 0.8626 | -0.4022 | 0.4264 | -306.3474 | -306.2648 | -2.7404 | -2.7733 |
| 0.6185 | 0.39 | 1500 | 0.6162 | -0.1021 | -0.3356 | 0.7063 | 0.2335 | 0.8779 | -0.3976 | 0.4349 | -295.7101 | -294.7082 | -2.7425 | -2.7745 |
| 0.6066 | 0.42 | 1600 | 0.6141 | -0.1696 | -0.4256 | 0.7123 | 0.2560 | 0.9394 | -0.4398 | 0.4678 | -304.7078 | -301.4554 | -2.7367 | -2.7689 |
| 0.6048 | 0.44 | 1700 | 0.6123 | -0.1220 | -0.3748 | 0.7123 | 0.2529 | 0.9411 | -0.4235 | 0.4656 | -299.6321 | -296.6920 | -2.7315 | -2.7638 |
| 0.609 | 0.47 | 1800 | 0.6090 | -0.1424 | -0.4122 | 0.7282 | 0.2698 | 0.9829 | -0.4478 | 0.4813 | -303.3703 | -298.7344 | -2.7251 | -2.7574 |
| 0.5909 | 0.5 | 1900 | 0.6062 | -0.2373 | -0.5239 | 0.7183 | 0.2866 | 1.0475 | -0.4860 | 0.5181 | -314.5422 | -308.2264 | -2.7186 | -2.7507 |
| 0.6011 | 0.52 | 2000 | 0.6048 | -0.1288 | -0.4109 | 0.7242 | 0.2821 | 1.0037 | -0.4627 | 0.4932 | -303.2409 | -297.3789 | -2.7100 | -2.7425 |
| 0.6047 | 0.55 | 2100 | 0.6031 | -0.1486 | -0.4420 | 0.7262 | 0.2934 | 1.0559 | -0.4792 | 0.5193 | -306.3505 | -299.3512 | -2.7123 | -2.7448 |
| 0.592 | 0.58 | 2200 | 0.6011 | -0.2623 | -0.5777 | 0.7242 | 0.3154 | 1.1326 | -0.5284 | 0.5638 | -319.9217 | -310.7270 | -2.7100 | -2.7423 |
| 0.6285 | 0.6 | 2300 | 0.6022 | -0.3099 | -0.6207 | 0.7242 | 0.3108 | 1.1254 | -0.5181 | 0.5570 | -324.2166 | -315.4819 | -2.7044 | -2.7370 |
| 0.6258 | 0.63 | 2400 | 0.6005 | -0.1642 | -0.4737 | 0.7302 | 0.3095 | 1.0716 | -0.4957 | 0.5259 | -309.5165 | -300.9170 | -2.6960 | -2.7291 |
| 0.5855 | 0.65 | 2500 | 0.5981 | -0.2145 | -0.5381 | 0.7341 | 0.3237 | 1.1337 | -0.5235 | 0.5568 | -315.9617 | -305.9418 | -2.6924 | -2.7253 |
| 0.6095 | 0.68 | 2600 | 0.5970 | -0.2416 | -0.5724 | 0.7262 | 0.3308 | 1.1753 | -0.5364 | 0.5756 | -319.3885 | -308.6579 | -2.6859 | -2.7187 |
| 0.6013 | 0.71 | 2700 | 0.5961 | -0.2450 | -0.5789 | 0.7262 | 0.3340 | 1.1924 | -0.5460 | 0.5830 | -320.0433 | -308.9903 | -2.6845 | -2.7170 |
| 0.6233 | 0.73 | 2800 | 0.5954 | -0.2426 | -0.5787 | 0.7302 | 0.3361 | 1.2015 | -0.5491 | 0.5882 | -320.0177 | -308.7550 | -2.6852 | -2.7174 |
| 0.6119 | 0.76 | 2900 | 0.5944 | -0.2613 | -0.6032 | 0.7282 | 0.3419 | 1.2206 | -0.5595 | 0.6006 | -322.4701 | -310.6289 | -2.6853 | -2.7176 |
| 0.5644 | 0.79 | 3000 | 0.5938 | -0.2218 | -0.5648 | 0.7282 | 0.3430 | 1.1989 | -0.5312 | 0.5872 | -318.6263 | -306.6716 | -2.6826 | -2.7150 |
| 0.5946 | 0.81 | 3100 | 0.5932 | -0.2763 | -0.6239 | 0.7262 | 0.3476 | 1.2359 | -0.5639 | 0.6094 | -324.5376 | -312.1256 | -2.6762 | -2.7090 |
| 0.5961 | 0.84 | 3200 | 0.5930 | -0.2713 | -0.6200 | 0.7262 | 0.3487 | 1.2365 | -0.5595 | 0.6090 | -324.1454 | -311.6203 | -2.6815 | -2.7140 |
| 0.5841 | 0.86 | 3300 | 0.5927 | -0.2686 | -0.6177 | 0.7302 | 0.3491 | 1.2362 | -0.5602 | 0.6093 | -323.9175 | -311.3521 | -2.6834 | -2.7157 |
| 0.611 | 0.89 | 3400 | 0.5925 | -0.2485 | -0.5979 | 0.7361 | 0.3493 | 1.2281 | -0.5496 | 0.6023 | -321.9356 | -309.3477 | -2.6821 | -2.7145 |
| 0.5458 | 0.92 | 3500 | 0.5925 | -0.2494 | -0.5988 | 0.7341 | 0.3494 | 1.2280 | -0.5516 | 0.6025 | -322.0256 | -309.4359 | -2.6792 | -2.7118 |
| 0.5926 | 0.94 | 3600 | 0.5925 | -0.2520 | -0.6014 | 0.7321 | 0.3494 | 1.2312 | -0.5539 | 0.6042 | -322.2860 | -309.6909 | -2.6837 | -2.7160 |
| 0.6096 | 0.97 | 3700 | 0.5926 | -0.2517 | -0.6015 | 0.7341 | 0.3497 | 1.2313 | -0.5539 | 0.6042 | -322.2966 | -309.6683 | -2.6793 | -2.7119 |
| 0.5865 | 0.99 | 3800 | 0.5925 | -0.2517 | -0.6019 | 0.7341 | 0.3502 | 1.2316 | -0.5546 | 0.6038 | -322.3433 | -309.6684 | -2.6801 | -2.7126 |
### Framework versions

- PEFT 0.7.1
- Transformers 4.39.0.dev0
- Pytorch 2.1.2+cu121
- Datasets 2.14.6
- Tokenizers 0.15.2
adapter_model.safetensors (CHANGED)

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:607b501fcfaf72a6aa3455caf733453d4d58a4f3ab30b8a8e971b9ec9b0af0c1
 size 671150064
```
all_results.json (ADDED)

```diff
@@ -0,0 +1,8 @@
+{
+    "epoch": 1.0,
+    "train_loss": 0.619671463092813,
+    "train_runtime": 44477.573,
+    "train_samples": 61134,
+    "train_samples_per_second": 1.374,
+    "train_steps_per_second": 0.086
+}
```
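The reported throughput figures are internally consistent, which a quick check confirms. The step count is an assumption derived from 61,134 samples at the effective batch size of 16; it is not stated in the files themselves:

```python
import math

stats = {
    "train_runtime": 44477.573,  # seconds
    "train_samples": 61134,
    "train_samples_per_second": 1.374,
    "train_steps_per_second": 0.086,
}

samples_per_sec = stats["train_samples"] / stats["train_runtime"]
steps = math.ceil(stats["train_samples"] / 16)  # assumed effective batch 16
steps_per_sec = steps / stats["train_runtime"]
```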
runs/Jul22_23-14-44_node26/events.out.tfevents.1721657973.node26.3249241.0 (CHANGED)

```diff
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
-size
+oid sha256:0ab491a6461a6c2d91253b2087e7642b658580b098451f3f0607f33828d18ac2
+size 377805
```
train_results.json (ADDED)

```diff
@@ -0,0 +1,8 @@
+{
+    "epoch": 1.0,
+    "train_loss": 0.619671463092813,
+    "train_runtime": 44477.573,
+    "train_samples": 61134,
+    "train_samples_per_second": 1.374,
+    "train_steps_per_second": 0.086
+}
```
trainer_state.json (ADDED)

The diff for this file is too large to render.