# NeuralMarcoro14-7B
This is a DPO fine-tuned version of [mlabonne/Marcoro14-7B-slerp](https://huggingface.co/mlabonne/Marcoro14-7B-slerp). It improves the model's performance on the Nous benchmark suite (results on the Open LLM Leaderboard are still pending).

## 🏆 Evaluation

| Model | AGIEval | GPT4All | TruthfulQA | Bigbench | Average |
|---|---|---|---|---|---|
| [NeuralMarcoro14-7B](https://huggingface.co/mlabonne/NeuralMarcoro14-7B) | 44.59 | 76.17 | 65.94 | 46.9 | 58.4 |
| Change | -0.07 | -0.07 | +1.79 | +1.26 | +0.73 |

The *Change* row is measured against the base model, Marcoro14-7B-slerp.

## 🧩 Training hyperparameters

**LoRA**:
* r=16
* lora_alpha=16
* lora_dropout=0.05
* bias="none"
* task_type="CAUSAL_LM"
* target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj', 'q_proj', 'o_proj', 'down_proj']

**Training arguments**:
* per_device_train_batch_size=4
* gradient_accumulation_steps=4
* gradient_checkpointing=True
* learning_rate=5e-5
* lr_scheduler_type="cosine"
* max_steps=200
* optim="paged_adamw_32bit"
* warmup_steps=100

**DPOTrainer**:
* beta=0.1
* max_prompt_length=1024
* max_length=1536
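
The hyperparameters above can be wired together roughly as follows. This is a minimal sketch, not the exact training script: the preference dataset, the `output_dir`, and the precise `DPOTrainer` signature (which has changed across `trl` versions; this assumes the ~0.7.x API, where `beta` and the length limits are passed directly) are all assumptions.

```python
# Sketch of a DPO fine-tuning setup using the hyperparameters listed above.
# Assumes trl ~0.7.x; the dataset name and output_dir are illustrative only.
# The dataset must provide "prompt", "chosen", and "rejected" columns.
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, TrainingArguments
from trl import DPOTrainer

model_name = "mlabonne/Marcoro14-7B-slerp"  # base model being fine-tuned

peft_config = LoraConfig(
    r=16,
    lora_alpha=16,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=['k_proj', 'gate_proj', 'v_proj', 'up_proj',
                    'q_proj', 'o_proj', 'down_proj'],
)

training_args = TrainingArguments(
    output_dir="NeuralMarcoro14-7B",     # assumption
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,       # effective batch size of 16
    gradient_checkpointing=True,
    learning_rate=5e-5,
    lr_scheduler_type="cosine",
    max_steps=200,
    optim="paged_adamw_32bit",
    warmup_steps=100,
)

model = AutoModelForCausalLM.from_pretrained(model_name)
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Hypothetical preference dataset; any prompt/chosen/rejected dataset works.
train_dataset = load_dataset("mlabonne/chatml_dpo_pairs", split="train")

trainer = DPOTrainer(
    model,
    ref_model=None,            # with a peft_config, trl derives the frozen reference model
    args=training_args,
    train_dataset=train_dataset,
    tokenizer=tokenizer,
    peft_config=peft_config,
    beta=0.1,
    max_prompt_length=1024,
    max_length=1536,
)
trainer.train()
```

With `gradient_accumulation_steps=4` and `per_device_train_batch_size=4`, each optimizer step sees 16 preference pairs, and `max_steps=200` caps the run at 3,200 pairs per device.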

## 💻 Usage