Update 2024-01-03
Check out our v0.4 model which is based on this and achieves better average score of 71.19 versus 69.66.
Model Description
This is an update to EmbeddedLLM/Mistral-7B-Merge-14-v0.2 that removes potentially TruthfulQA-contaminated models and non-commercially licensed models:
- berkeley-nest/Starling-LM-7B-alpha
- Q-bert/MetaMath-Cybertron-Starling
- v1olet/v1olet_marcoroni-go-bruins-merge-7B
This is an experiment to test merging 14 models using DARE TIES 🦙
The result is a base model that performs quite well but may need some further chat fine-tuning.
The 14 models are as follows:
- mistralai/Mistral-7B-Instruct-v0.2
- ehartford/dolphin-2.2.1-mistral-7b
- SciPhi/SciPhi-Mistral-7B-32k
- ehartford/samantha-1.2-mistral-7b
- Arc53/docsgpt-7b-mistral
- HuggingFaceH4/zephyr-7b-beta
- meta-math/MetaMath-Mistral-7B
- Open-Orca/Mistral-7B-OpenOrca
- openchat/openchat-3.5-1210
- beowolx/MistralHermes-CodePro-7B-v1
- TIGER-Lab/MAmmoTH-7B-Mistral
- teknium/OpenHermes-2.5-Mistral-7B
- Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
- mlabonne/NeuralHermes-2.5-Mistral-7B
- base model: mistralai/Mistral-7B-v0.1
Open LLM Leaderboard
v0.3 | v0.4 | |
---|---|---|
Average | 69.66 | 71.19 |
ARC | 65.96 | 66.81 |
HellaSwag | 85.29 | 86.15 |
MMLU | 64.35 | 65.10 |
TruthfulQA | 57.80 | 58.25 |
Winogrande | 78.30 | 80.03 |
GSM8K | 66.26 | 70.81 |
Chat Template
We tried ChatML and Llama-2 chat template, but feel free to try other templates.
Merge Configuration
The merge config file for this model is here:
models:
- model: mistralai/Mistral-7B-v0.1
# no parameters necessary for base model
- model: ehartford/dolphin-2.2.1-mistral-7b
parameters:
weight: 0.08
density: 0.4
- model: SciPhi/SciPhi-Mistral-7B-32k
parameters:
weight: 0.08
density: 0.4
- model: ehartford/samantha-1.2-mistral-7b
parameters:
weight: 0.08
density: 0.4
- model: Arc53/docsgpt-7b-mistral
parameters:
weight: 0.08
density: 0.4
- model: HuggingFaceH4/zephyr-7b-beta
parameters:
weight: 0.08
density: 0.4
- model: meta-math/MetaMath-Mistral-7B
parameters:
weight: 0.08
density: 0.4
- model: Open-Orca/Mistral-7B-OpenOrca
parameters:
weight: 0.08
density: 0.4
- model: openchat/openchat-3.5-1210
parameters:
weight: 0.08
density: 0.4
- model: beowolx/MistralHermes-CodePro-7B-v1
parameters:
weight: 0.08
density: 0.4
- model: TIGER-Lab/MAmmoTH-7B-Mistral
parameters:
weight: 0.08
density: 0.4
- model: teknium/OpenHermes-2.5-Mistral-7B
parameters:
weight: 0.08
density: 0.4
- model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
parameters:
weight: 0.08
density: 0.4
- model: mlabonne/NeuralHermes-2.5-Mistral-7B
parameters:
weight: 0.08
density: 0.4
- model: mistralai/Mistral-7B-Instruct-v0.2
parameters:
weight: 0.08
density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
int8_mask: true
dtype: bfloat16
- Downloads last month
- 424
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social
visibility and check back later, or deploy to Inference Endpoints (dedicated)
instead.
Model tree for EmbeddedLLM/Mistral-7B-Merge-14-v0.3
Merge model
this model