sedrickkeh committed on
Commit b5f4085 · verified · 1 Parent(s): d2c92c7

Model save

  1. README.md +5 -6
README.md CHANGED
@@ -4,7 +4,6 @@ license: apache-2.0
  base_model: mistralai/Mistral-7B-v0.1
  tags:
  - llama-factory
- - full
  - generated_from_trainer
  model-index:
  - name: hp_ablations_grid_mistral_bsz2048_lr2e-6_scheduler-cosine-warmup0.15-minlr5e-7
@@ -16,9 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
 
  # hp_ablations_grid_mistral_bsz2048_lr2e-6_scheduler-cosine-warmup0.15-minlr5e-7
 
- This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the mlfoundations-dev/oh-dcft-v3-llama3.1-nemotron-70b_shareGPT_format dataset.
+ This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unknown dataset.
  It achieves the following results on the evaluation set:
- - Loss: 0.0599
+ - Loss: 0.0602
 
  ## Model description
 
@@ -55,9 +54,9 @@ The following hyperparameters were used during training:
 
  | Training Loss | Epoch | Step | Validation Loss |
  |:-------------:|:-----:|:----:|:---------------:|
- | 0.528 | 1.0 | 168 | 0.0656 |
- | 0.4738 | 2.0 | 336 | 0.0611 |
- | 0.4405 | 3.0 | 504 | 0.0599 |
+ | 0.5294 | 1.0 | 168 | 0.0658 |
+ | 0.4791 | 2.0 | 336 | 0.0617 |
+ | 0.4473 | 3.0 | 504 | 0.0602 |
 
 
  ### Framework versions