lucaelin committed on
Commit 4449636 · verified · 1 Parent(s): f4a0cc5

End of training

Files changed (2)
  1. README.md +13 -12
  2. model.safetensors +1 -1
README.md CHANGED
@@ -1,7 +1,5 @@
  ---
  tags:
- - trl
- - sft
  - generated_from_trainer
  model-index:
  - name: minillama2
@@ -15,12 +13,12 @@ should probably proofread and complete it, then remove this comment. -->

  This model is a fine-tuned version of [](https://huggingface.co/) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - eval_loss: 8.6515
- - eval_runtime: 0.0987
- - eval_samples_per_second: 101.294
- - eval_steps_per_second: 10.129
- - epoch: 0.0567
- - step: 34000
+ - eval_loss: 3.9366
+ - eval_runtime: 0.8539
+ - eval_samples_per_second: 149.905
+ - eval_steps_per_second: 2.342
+ - epoch: 0.1147
+ - step: 4300

  ## Model description

@@ -39,13 +37,16 @@ More information needed
  ### Training hyperparameters

  The following hyperparameters were used during training:
- - learning_rate: 0.0001
+ - learning_rate: 0.0005
  - train_batch_size: 32
- - eval_batch_size: 16
+ - eval_batch_size: 64
  - seed: 42
+ - gradient_accumulation_steps: 8
+ - total_train_batch_size: 256
  - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- - lr_scheduler_type: linear
- - training_steps: 600000.0
+ - lr_scheduler_type: cosine
+ - lr_scheduler_warmup_steps: 100
+ - training_steps: 37500.0

  ### Framework versions

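The new hyperparameters in the README hunk above can be cross-checked with a little arithmetic. The sketch below is illustrative only: it assumes a single training device, assumes the "cosine" scheduler is the usual linear warmup followed by a half-cosine decay to zero, and assumes throughput metrics are computed as count divided by runtime. The names `num_devices` and `lr_at` are hypothetical and do not come from the commit.

```python
import math

# Values copied from the "+" side of the README.md diff.
train_batch_size = 32
gradient_accumulation_steps = 8
learning_rate = 5e-4
warmup_steps = 100
training_steps = 37500

# Assumption: one device, so the effective batch is the per-device batch
# multiplied by the number of accumulated micro-batches.
num_devices = 1
total_train_batch_size = train_batch_size * gradient_accumulation_steps * num_devices


def lr_at(step):
    """Learning rate at a given optimizer step.

    Assumed formulation: linear warmup from 0 to the peak rate over
    `warmup_steps`, then half-cosine decay to 0 at `training_steps`.
    """
    if step < warmup_steps:
        return learning_rate * step / warmup_steps
    progress = (step - warmup_steps) / (training_steps - warmup_steps)
    return learning_rate * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumption: samples_per_second = samples / runtime, so multiplying back
# gives an estimate of the evaluation set size and the number of eval batches.
eval_runtime = 0.8539
est_eval_samples = round(149.905 * eval_runtime)
est_eval_steps = round(2.342 * eval_runtime)

if __name__ == "__main__":
    print("total_train_batch_size:", total_train_batch_size)
    print("lr at step 4300:", lr_at(4300))
    print("estimated eval samples:", est_eval_samples)
    print("estimated eval steps:", est_eval_steps)
```

Under these assumptions the product 32 × 8 reproduces the reported `total_train_batch_size` of 256, and the estimated evaluation set of about 128 samples split into 2 steps is consistent with the new `eval_batch_size` of 64.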
model.safetensors CHANGED
@@ -1,3 +1,3 @@
  version https://git-lfs.github.com/spec/v1
- oid sha256:db4634f8628007f7d029bf4bae32af527075b2d64f1e5348329382be52f1ead7
+ oid sha256:f144a986bd7ce2fc8ac8f5a6a309585a3a28bcad354d83a90a67c634fb45387f
  size 100963088
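The `model.safetensors` entry in the repository is a Git LFS pointer, not the weights themselves: it records a spec version, a SHA-256 object id, and a byte size. As a minimal sketch, a downloaded file could be checked against the new pointer as follows. The helper names `parse_lfs_pointer` and `matches_pointer` are hypothetical, and the local file path is left to the caller.

```python
import hashlib


def parse_lfs_pointer(text):
    """Parse the 'key value' lines of a Git LFS pointer file into a dict."""
    fields = {}
    for line in text.strip().splitlines():
        key, _, value = line.partition(" ")
        fields[key] = value
    return fields


def matches_pointer(path, pointer):
    """Return True if the file at `path` has the digest and size in `pointer`."""
    algo, _, expected_digest = pointer["oid"].partition(":")
    digest = hashlib.new(algo)
    size = 0
    with open(path, "rb") as f:
        # Read in 1 MiB chunks so a large weights file is not loaded at once.
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
            size += len(chunk)
    return digest.hexdigest() == expected_digest and size == int(pointer["size"])


# Pointer contents copied from the "+" side of the diff.
new_pointer = parse_lfs_pointer(
    "version https://git-lfs.github.com/spec/v1\n"
    "oid sha256:f144a986bd7ce2fc8ac8f5a6a309585a3a28bcad354d83a90a67c634fb45387f\n"
    "size 100963088\n"
)
```

Calling `matches_pointer("model.safetensors", new_pointer)` on a local copy would then confirm that it corresponds to this commit. The size is identical before and after (100963088 bytes) while the oid differs, which is what one would expect if the tensor values changed but their shapes and dtypes did not.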