Adil1567 committed
Commit d836997 · verified · 1 Parent(s): 3cd1f87

Model save
README.md CHANGED
@@ -17,6 +17,8 @@ should probably proofread and complete it, then remove this comment. -->
 # mistral-sft-lora-fsdp
 
 This model is a fine-tuned version of [meta-llama/Llama-3.1-70B-Instruct](https://huggingface.co/meta-llama/Llama-3.1-70B-Instruct) on the None dataset.
+It achieves the following results on the evaluation set:
+- Loss: 0.4034
 
 ## Model description
 
@@ -51,7 +53,7 @@ The following hyperparameters were used during training:
 
 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
-| No log        | 1.0   | 1    | 2.0744          |
+| 0.4001        | 1.0   | 1042 | 0.4034          |
 
 
 ### Framework versions
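The updated model card reports a validation loss of 0.4034 after one full epoch (1042 steps), down from 2.0744 in the earlier smoke-test run. As a rough sanity check, a causal-LM cross-entropy loss can be converted to perplexity via `exp(loss)` — a minimal sketch, not part of the commit itself:

```python
import math

# Validation cross-entropy loss reported in the updated README.
val_loss = 0.4034

# For a causal language model, perplexity = exp(cross-entropy loss).
perplexity = math.exp(val_loss)
print(f"perplexity ~ {perplexity:.3f}")  # ~ 1.497
```

A perplexity near 1.5 on the held-out set is consistent with a fine-tuned instruction model rather than a run that never trained (the earlier loss of 2.0744 corresponds to a perplexity of about 8).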
adapter_config.json CHANGED
@@ -23,13 +23,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-    "q_proj",
-    "down_proj",
     "k_proj",
-    "gate_proj",
-    "o_proj",
     "v_proj",
-    "up_proj"
+    "q_proj",
+    "up_proj",
+    "o_proj",
+    "down_proj",
+    "gate_proj"
   ],
   "task_type": "CAUSAL_LM",
   "use_dora": false,
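Note that this hunk only reorders `target_modules`; as a set, the old and new lists name the same seven projection layers, so the LoRA adapter attaches to the same modules before and after the commit. A quick check of that claim:

```python
# target_modules before and after the commit, copied from the diff above.
old = ["q_proj", "down_proj", "k_proj", "gate_proj", "o_proj", "v_proj", "up_proj"]
new = ["k_proj", "v_proj", "q_proj", "up_proj", "o_proj", "down_proj", "gate_proj"]

# The change is purely an ordering difference: the sets are identical,
# so the same attention and MLP projections carry LoRA adapters.
assert set(old) == set(new)
print(sorted(new))
```

PEFT treats `target_modules` as an unordered collection when matching module names, so a serialization-order change like this is functionally a no-op; the real change in this commit is the retrained adapter weights.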
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:74080fe3b4e27d8b26948ec99ea7878a5809653f6b478b100162ea3cc0fa7d6b
+oid sha256:25dc9200ba0e17c35af3d705977410efdb61bb811e0bf27e029f6de9b20025b6
 size 828526568
runs/Jan05_11-05-20_gpu-server/events.out.tfevents.1736075303.gpu-server.2867849.0 CHANGED
@@ -1,3 +1,3 @@
 version https://git-lfs.github.com/spec/v1
-oid sha256:bde0c09869ea73d4db8bfc73745b3636b2d1990c81a469bb086d575ae5b4b40b
+oid sha256:6929c2a159d3cfefa871e6469674e3cf7db1ea05ffbd793c549ac635e1fa68a3
-size 49395
+size 50020