Tuch committed · Commit f9b70de · verified · 1 parent: 4d0e8a9

Model save

Files changed (2)
  1. README.md +14 -14
  2. adapter_model.safetensors +1 -1
README.md CHANGED
@@ -1,11 +1,11 @@
  ---
- base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
- library_name: peft
  license: llama3
+ library_name: peft
  tags:
  - trl
  - sft
  - generated_from_trainer
+ base_model: scb10x/llama-3-typhoon-v1.5-8b-instruct
  model-index:
  - name: results_1
    results: []
@@ -18,12 +18,12 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [scb10x/llama-3-typhoon-v1.5-8b-instruct](https://huggingface.co/scb10x/llama-3-typhoon-v1.5-8b-instruct) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - eval_loss: 1.1364
- - eval_runtime: 81.6834
- - eval_samples_per_second: 5.497
- - eval_steps_per_second: 0.698
- - epoch: 1.8060
- - step: 270
+ - eval_loss: 1.2106
+ - eval_runtime: 45.0962
+ - eval_samples_per_second: 9.956
+ - eval_steps_per_second: 1.264
+ - epoch: 9.07
+ - step: 510

  ## Model description

@@ -43,19 +43,19 @@ More information needed

  The following hyperparameters were used during training:
  - learning_rate: 1e-05
- - train_batch_size: 3
+ - train_batch_size: 8
  - eval_batch_size: 8
  - seed: 42
  - gradient_accumulation_steps: 4
- - total_train_batch_size: 12
+ - total_train_batch_size: 32
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
  - lr_scheduler_type: linear
  - num_epochs: 12

  ### Framework versions

- - PEFT 0.11.1
- - Transformers 4.42.3
- - Pytorch 2.3.1+cu121
+ - PEFT 0.10.0
+ - Transformers 4.39.1
+ - Pytorch 2.3.0+cu121
  - Datasets 2.18.0
- - Tokenizers 0.19.1
+ - Tokenizers 0.15.2
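For reference, the updated hyperparameters map onto a standard `transformers` training setup: 8 per-device samples × 4 gradient-accumulation steps gives the listed total train batch size of 32 on a single device. A minimal sketch of the corresponding `TrainingArguments` (the `output_dir` name is a hypothetical placeholder, dataset and model wiring are omitted, and the Adam betas/epsilon listed in the card are the library defaults):

```python
from transformers import TrainingArguments

# Sketch only: mirrors the hyperparameters in the updated card.
args = TrainingArguments(
    output_dir="results_1",          # hypothetical; matches the model-index name
    learning_rate=1e-5,
    per_device_train_batch_size=8,   # train_batch_size: 8
    per_device_eval_batch_size=8,    # eval_batch_size: 8
    gradient_accumulation_steps=4,   # 8 * 4 = 32 total_train_batch_size
    num_train_epochs=12,
    lr_scheduler_type="linear",
    seed=42,                         # Adam betas/epsilon left at defaults
)
```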
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:b59973061cd09c1f9f7769963bfb9cf50f9b88627f0b737972d115150aac17b6
+ oid sha256:afc669b0729a2aa6a6ae21c453fb27830ce3e380ede846de8456d64e5f014c0b
  size 125889008
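The `adapter_model.safetensors` blob updated here is the PEFT adapter itself, which, per the card metadata, applies on top of the Typhoon base model. A minimal loading sketch, assuming a hypothetical Hub path `Tuch/results_1` for this repository:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_id = "scb10x/llama-3-typhoon-v1.5-8b-instruct"
tokenizer = AutoTokenizer.from_pretrained(base_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)

# "Tuch/results_1" is an assumed adapter repo id; substitute the real path.
model = PeftModel.from_pretrained(base, "Tuch/results_1")
model.eval()
```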