mrm8488/llama-2-coder-7b

mrm8488 committed on Jul 26, 2023

Commit 9af29c5 · 1 parent: c4b1a7a

Update README.md

Files changed (1)

README.md +26 -7

@@ -39,17 +39,36 @@ Meta developed and publicly released the Llama 2 family of large language models
 ### Training hyperparameters ⚙
-TBA
 ### Training results 🗒️
 | Step | Training Loss | Validation Loss |
-|------|---------------|-----------------|
-| 100  | 0.798500      | 0.767996        |
-| 200  | 0.725900      | 0.749880        |
-| 300  | 0.669100      | 0.748029        |
-| 400  | 0.687300      | 0.742342        |
-| 500  | 0.579900      | 0.736735        |

 ### Training hyperparameters ⚙
+```py
+    optim="paged_adamw_32bit",
+    num_train_epochs = 2,
+    eval_steps=50,
+    save_steps=50,
+    evaluation_strategy="steps",
+    save_strategy="steps",
+    save_total_limit=2,
+    seed=66,
+    load_best_model_at_end=True,
+    logging_steps=1,
+    learning_rate=2e-4,
+    fp16=True,
+    bf16=False,
+    max_grad_norm=0.3,
+    warmup_ratio=0.03,
+    group_by_length=True,
+    lr_scheduler_type="constant"
+```
 ### Training results 🗒️
 | Step | Training Loss | Validation Loss |
+|------|----------|----------|
+| 50   | 0.624400 | 0.600070 |
+| 100  | 0.634100 | 0.592757 |
+| 150  | 0.545800 | 0.586652 |
+| 200  | 0.572500 | 0.577525 |
+| 250  | 0.528000 | 0.590118 |
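
The added arguments set `load_best_model_at_end=True` with `eval_steps=50`, so the run evaluates at each step listed in the new table and keeps the best checkpoint. As a minimal sketch, assuming the selection metric is validation loss (no `metric_for_best_model` appears in this diff, so this is an assumption) and using a hypothetical helper name `best_checkpoint_step`, the step that would be kept can be read off the table like this:

```python
# Validation losses copied from the added ("+") rows of the results table.
validation_loss_by_step = {
    50: 0.600070,
    100: 0.592757,
    150: 0.586652,
    200: 0.577525,
    250: 0.590118,
}


def best_checkpoint_step(losses):
    """Return the step with the lowest validation loss (lower is better).

    Hypothetical helper for illustration only; it is not part of any library.
    """
    return min(losses, key=losses.get)


best_step = best_checkpoint_step(validation_loss_by_step)
print(best_step, validation_loss_by_step[best_step])  # 200 0.577525
```

Under that assumption, step 200 is the checkpoint with the lowest validation loss, and the loss rises again at step 250.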