mlfoundations-dev/hp_ablations_grid_mistral_bsz2048_lr2e-6_scheduler-cosine-warmup0.15-minlr5e-7



sedrickkeh committed 29 days ago

Commit b5f4085 (verified), 1 parent: d2c92c7

Model save

Files changed (1)

README.md +5 -6


@@ -4,7 +4,6 @@ license: apache-2.0
 base_model: mistralai/Mistral-7B-v0.1
 tags:
 - llama-factory
-- full
 - generated_from_trainer
 model-index:
 - name: hp_ablations_grid_mistral_bsz2048_lr2e-6_scheduler-cosine-warmup0.15-minlr5e-7
@@ -16,9 +15,9 @@ should probably proofread and complete it, then remove this comment. -->
 # hp_ablations_grid_mistral_bsz2048_lr2e-6_scheduler-cosine-warmup0.15-minlr5e-7
-This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on the mlfoundations-dev/oh-dcft-v3-llama3.1-nemotron-70b_shareGPT_format dataset.
+This model is a fine-tuned version of [mistralai/Mistral-7B-v0.1](https://huggingface.co/mistralai/Mistral-7B-v0.1) on an unknown dataset.
 It achieves the following results on the evaluation set:
-- Loss: 0.0599
+- Loss: 0.0602
 ## Model description
@@ -55,9 +54,9 @@ The following hyperparameters were used during training:
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| 0.528         | 1.0   | 168  | 0.0656          |
-| 0.4738        | 2.0   | 336  | 0.0611          |
-| 0.4405        | 3.0   | 504  | 0.0599          |
+| 0.5294        | 1.0   | 168  | 0.0658          |
+| 0.4791        | 2.0   | 336  | 0.0617          |
+| 0.4473        | 3.0   | 504  | 0.0602          |
 ### Framework versions