# NeuralMarcoro14-7B
This is a DPO fine-tuned version of [mlabonne/Marcoro14-7B-slerp](https://huggingface.co/mlabonne/Marcoro14-7B-slerp). It improves the model's performance on the Nous benchmark suite (results on the Open LLM Leaderboard are still pending).

## 🏆 Evaluation

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| [NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B) | 44.59 | 76.17 | 65.94 | 46.9 | 58.4 |
| Change | -0.07 | -0.07 | +1.79 | +1.26 | +0.73 |

The *Change* row is measured against the base model, Marcoro14-7B-slerp.

## 🧩 Training hyperparameters

**LoRA**:
* r=16
* lora_alpha=16
* lora_dropout=0.05
* bias="none"
* task_type="CAUSAL_LM"
* target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']

**Training arguments**:
* per_device_train_batch_size=4
* gradient_accumulation_steps=4
* gradient_checkpointing=True
* learning_rate=5e-5
* lr_scheduler_type="cosine"
* max_steps=200
* optim="paged_adamw_32bit"
* warmup_steps=100

**DPOTrainer**:
* beta=0.1
* max_prompt_length=1024
* max_length=1536
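
The hyperparameters above can be wired together roughly as follows. This is a minimal sketch, not the exact training script: the preference dataset, the `output_dir`, and the precise `DPOTrainer` signature (which has changed across `trl` versions; this assumes the ~0.7.x API, where `beta` and the length limits are passed directly) are all assumptions.

```python
# Sketch of a DPO fine-tuning setup using the hyperparameters listed above.
# Assumes trl ~0.7.x; the dataset name and output_dir are illustrative only.
# The dataset must provide "prompt", "chosen", and "rejected" columns.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "mlabonne/Marcoro14-7B-slerp"  # base model being fine-tuned

peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj',
                    'q_proj', 'o_proj', 'down_proj'],
)

training_args = TrainingArguments(
    output_dir="NeuralMarcoro14-7B",     # assumption
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,       # effective batch size of 16
    gradient_checkpointing=True,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=200,
    optim="paged_adamw_32bit",
    warmup_steps=100,
)

model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Hypothetical preference dataset; any prompt/chosen/rejected dataset works.
train_dataset = load_dataset("mlabonne/chatml_dpo_pairs", split="train")

trainer = DPOTrainer(
    model,
    ref_model=None,            # with a peft_config, trl derives the frozen reference model
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```

With `gradient_accumulation_steps=4` and `per_device_train_batch_size=4`, each optimizer step sees 16 preference pairs, and `max_steps=200` caps the run at 3,200 pairs per device.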

## 💻 Usage