Model save

- README.md +14 -14
- adapter_model.safetensors +1 -1
README.md
CHANGED
@@ -1,11 +1,11 @@
 ---
-base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
-library_name: peft
 license: llama3
+library_name: peft
 tags:
 - trl
 - sft
 - generated_from_trainer
+base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
 model-index:
 - name: results_1
   results: []
@@ -18,12 +18,12 @@ should probably proofread and complete it, then remove this comment. -->
 
 This model is a fine-tuned version of [scb10x/llama-3-typhoon-v1.5-8b-instruct](https://huggingface.co/scb10x/llama-3-typhoon-v1.5-8b-instruct) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- eval_loss: 1.
-- eval_runtime:
-- eval_samples_per_second:
-- eval_steps_per_second:
-- epoch:
-- step:
+- eval_loss: 1.2106
+- eval_runtime: 45.0962
+- eval_samples_per_second: 9.956
+- eval_steps_per_second: 1.264
+- epoch: 9.07
+- step: 510
 
 ## Model description
 
@@ -43,19 +43,19 @@
 
 The following hyperparameters were used during training:
 - learning_rate: 1e-05
-- train_batch_size:
+- train_batch_size: 8
 - eval_batch_size: 8
 - seed: 42
 - gradient_accumulation_steps: 4
-- total_train_batch_size:
+- total_train_batch_size: 32
 - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
 - lr_scheduler_type: linear
 - num_epochs: 12
 
 ### Framework versions
 
-- PEFT 0.
-- Transformers 4.
-- Pytorch 2.3.
+- PEFT 0.10.0
+- Transformers 4.39.1
+- Pytorch 2.3.0+cu121
 - Datasets 2.18.0
-- Tokenizers 0.
+- Tokenizers 0.15.2
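The updated hyperparameters are internally consistent: total_train_batch_size 32 is the per-device train batch size times the gradient accumulation steps (8 × 4 = 32). As an illustrative sketch only (the training script behind this commit is not shown), the listed values map onto a transformers TrainingArguments roughly as below; output_dir is a guess based on the model-index name, and the card's Adam betas=(0.9,0.999) and epsilon=1e-08 match the library's AdamW defaults, so they need no explicit arguments.

```python
# Illustrative sketch reconstructed from the card's hyperparameter list,
# not taken from the actual training code behind this commit.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="results_1",         # guess: matches the model-index name
    learning_rate=1e-5,             # learning_rate: 1e-05
    per_device_train_batch_size=8,  # train_batch_size: 8
    per_device_eval_batch_size=8,   # eval_batch_size: 8
    gradient_accumulation_steps=4,  # effective train batch: 8 * 4 = 32
    num_train_epochs=12,            # num_epochs: 12
    lr_scheduler_type="linear",     # lr_scheduler_type: linear
    seed=42,                        # seed: 42
    # Adam betas=(0.9, 0.999) and epsilon=1e-08 are the transformers
    # AdamW defaults, so the card's optimizer line is covered implicitly.
)
```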
adapter_model.safetensors
CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:
+oid sha256:afc669b0729a2aa6a6ae21c453fb27830ce3e380ede846de8456d64e5f014c0b
 size 125889008
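adapter_model.safetensors is tracked with Git LFS, so the diff above only updates the pointer file: the sha256 oid identifies the new adapter weights, while the ~126 MB payload itself lives in LFS storage. A minimal usage sketch, not part of this commit: the PEFT adapter is attached on top of the base model, with the adapter's Hub repo id shown here as a placeholder.

```python
# Minimal usage sketch, not part of the commit. "<user>/results_1" is a
# placeholder: substitute the actual Hub repo id of this adapter.
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = AutoModelForCausalLM.from_pretrained(
    "scb10x/llama-3-typhoon-v1.5-8b-instruct",
    torch_dtype="auto",   # load in the checkpoint's native dtype
    device_map="auto",    # requires accelerate; places layers automatically
)
model = PeftModel.from_pretrained(base, "<user>/results_1")  # PEFT adapter
tokenizer = AutoTokenizer.from_pretrained("scb10x/llama-3-typhoon-v1.5-8b-instruct")
```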