Adil1567 committed (verified)
Commit c1a6580 · Parent: fc87222

Model save
README.md CHANGED
@@ -17,6 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 # mistral-sft-lora-fsdp
 
 This model is a fine-tuned version of [meta-llama/Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 1.0178
 
 ## Model description
 
@@ -51,7 +53,7 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 1.0   | 1    | 1.8099          |
+| 0.9809        | 1.0   | 20   | 1.0178          |
 
 
 ### Framework versions
adapter_config.json CHANGED
@@ -23,13 +23,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "up_proj",
-    "k_proj",
-    "q_proj",
-    "gate_proj",
     "v_proj",
     "o_proj",
-    "down_proj"
+    "q_proj",
+    "k_proj",
+    "down_proj",
+    "up_proj",
+    "gate_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
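The reordered `target_modules` list still covers the same seven Llama projection matrices (attention q/k/v/o plus the MLP gate/up/down projections). As a minimal sketch, plain Python (not the actual training script) can reproduce the resulting fragment of `adapter_config.json`:

```python
import json

# Projection modules targeted by the LoRA adapter after this commit,
# in the order they appear in the updated adapter_config.json.
target_modules = [
    "v_proj", "o_proj", "q_proj", "k_proj",
    "down_proj", "up_proj", "gate_proj",
]

# Fragment of adapter_config.json as it reads after the change.
config_fragment = {
    "rank_pattern": {},
    "revision": None,
    "target_modules": target_modules,
    "task_type": "CAUSAL_LM",
    "use_dora": False,
}

print(json.dumps(config_fragment, indent=2))
```

Since the list only changed order (no modules were added or removed), the set of weight matrices receiving LoRA updates is unchanged; JSON arrays are order-preserving, so the serialized file still differs textually, which is why the diff shows five removals and five additions.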
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:e4e7c70b69f5bfd1902d13f38ebf24e6dbe1c5233de6df1a2d214c2a7b782e24
+oid sha256:3af779e366c638a194f86a7c739fa2410c3b76dd8c4205710484d9f4b88d3599
 size 414337624
runs/Jan04_17-38-23_gpu-server/events.out.tfevents.1736012482.gpu-server.1862567.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:4a1c5dc6118a7a9abe49a44a217baaabaa153b52e19d0b26032ada60ed197032
-size 6470
+oid sha256:57295ff647b5d41c062a3de66762749f9ff2a408eb27fa102652dcba0ba2d7c1
+size 7084