mlabonne committed on
Commit be69189 • 1 Parent(s): 7f578a1

Update README.md

Files changed (1)
  1. README.md +25 -25
README.md CHANGED
@@ -12,11 +12,7 @@ datasets:
 
# NeuralMarcoro14-7B
 
- This is a DPO fine-tune version of [mlabonne/Marcoro14-7B-slerp](https://huggingface.co/mlabonne/Marcoro14-7B-slerp)
-
- This model is a merge of the following models made with [mergekit](https://github.com/cg123/mergekit):
- * [AIDC-ai-business/Marcoroni-7B-v3](https://huggingface.co/AIDC-ai-business/Marcoroni-7B-v3)
- * [EmbeddedLLM/Mistral-7B-Merge-14-v0.1](https://huggingface.co/EmbeddedLLM/Mistral-7B-Merge-14-v0.1)
+ This is a DPO fine-tuned version of [mlabonne/Marcoro14-7B-slerp](https://huggingface.co/mlabonne/Marcoro14-7B-slerp). It improves the model's performance on the Nous benchmark suite (results on the Open LLM Leaderboard are pending).
 
## 🏆 Evaluation
 
@@ -26,26 +22,30 @@ This model is a merge of the following models made with [mergekit](https://githu
|[NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B)| 44.59| 76.17| 65.94| 46.9| 58.4|
|Change | -0.07| -0.07| +1.79| +1.26| +0.73|
 
- ## 🧩 Configuration
-
- ```yaml
- slices:
-   - sources:
-       - model: AIDC-ai-business/Marcoroni-7B-v3
-         layer_range: [0, 32]
-       - model: EmbeddedLLM/Mistral-7B-Merge-14-v0.1
-         layer_range: [0, 32]
- merge_method: slerp
- base_model: AIDC-ai-business/Marcoroni-7B-v3
- parameters:
-   t:
-     - filter: self_attn
-       value: [0, 0.5, 0.3, 0.7, 1]
-     - filter: mlp
-       value: [1, 0.5, 0.7, 0.3, 0]
-     - value: 0.5
- dtype: bfloat16
- ```
+ ## 🧩 Training hyperparameters
+
+ **LoRA**:
+ * r=16
+ * lora_alpha=16
+ * lora_dropout=0.05
+ * bias="none"
+ * task_type="CAUSAL_LM"
+ * target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']
+
+ **Training arguments**:
+ * per_device_train_batch_size=4
+ * gradient_accumulation_steps=4
+ * gradient_checkpointing=True
+ * learning_rate=5e-5
+ * lr_scheduler_type="cosine"
+ * max_steps=200
+ * optim="paged_adamw_32bit"
+ * warmup_steps=100
+
+ **DPOTrainer**:
+ * beta=0.1
+ * max_prompt_length=1024
+ * max_length=1536
 
## 💻 Usage
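
Editor's note: the configuration removed in the hunk above is the mergekit slerp recipe that originally produced the Marcoro14-7B-slerp base. To clarify what `merge_method: slerp` and the `t` schedules do, here is a minimal, illustrative Python sketch of spherical linear interpolation between two weight tensors. This is not mergekit's internal implementation; the comment on the `t` schedules reflects my reading of mergekit's documented gradient syntax.

```python
import numpy as np

def slerp(t: float, v0: np.ndarray, v1: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Spherical linear interpolation between two flattened weight tensors.

    t=0 returns v0 (the base model side), t=1 returns v1.
    """
    # Measure the angle between the two weight directions on normalized copies.
    v0_n = v0 / (np.linalg.norm(v0) + eps)
    v1_n = v1 / (np.linalg.norm(v1) + eps)
    dot = float(np.clip(np.dot(v0_n, v1_n), -1.0, 1.0))
    if abs(dot) > 0.9995:
        # Nearly colinear weights: plain linear interpolation is numerically safer.
        return (1 - t) * v0 + t * v1
    theta = np.arccos(dot)
    # Weight the original (unnormalized) tensors by sines of the sub-angles.
    return (np.sin((1 - t) * theta) * v0 + np.sin(t * theta) * v1) / np.sin(theta)

# In the config, t is not one constant: a schedule like [0, 0.5, 0.3, 0.7, 1]
# for self_attn is stretched across the 32-layer range, so the interpolation
# strength varies with depth and with module type (self_attn vs. mlp), and
# the trailing `value: 0.5` is the default for everything else.
```

In mergekit itself, a config like this is consumed by its `mergekit-yaml` command-line entry point (e.g. `mergekit-yaml config.yaml ./merged-model`).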
51
 
 
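Editor's note: for readers who want to see how the hyperparameters added in this commit fit together, below is a minimal, hypothetical training sketch using peft, transformers, and trl. The dataset name and output directory are placeholders (the commit does not include the author's actual script), and the DPOTrainer keyword arguments follow the trl 0.7.x-era API, which may differ in newer releases.

```python
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "mlabonne/Marcoro14-7B-slerp"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.bfloat16)

# LoRA adapter matching the settings listed in the card.
peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["k_proj", "gate_proj", "v_proj", "up_proj",
                    "q_proj", "o_proj", "down_proj"],
)

# Training arguments matching the settings listed in the card.
training_args = TrainingArguments(
    output_dir="./neuralmarcoro14-7b",   # placeholder path
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,
    gradient_checkpointing=True,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=200,
    optim="paged_adamw_32bit",
    warmup_steps=100,
)

# A preference dataset with prompt/chosen/rejected columns (placeholder name).
dataset = load_dataset("your-username/your-dpo-pairs", split="train")

trainer = DPOTrainer(
    model,
    ref_model=None,          # with a PEFT config, trl uses the frozen base as reference
    args=training_args,
    train_dataset=dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```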