Update 2024-01-03

Check out our v0.4 model which is based on this and achieves better average score of 71.19 versus 69.66.

Model Description

This is an update to EmbeddedLLM/Mistral-7B-Merge-14-v0.2 that removes potentially TruthfulQA-contaminated models and non-commercially licensed models:

  1. berkeley-nest/Starling-LM-7B-alpha
  2. Q-bert/MetaMath-Cybertron-Starling
  3. v1olet/v1olet_marcoroni-go-bruins-merge-7B

This is an experiment to test merging 14 models using DARE TIES 🦙

The result is a base model that performs quite well but may need some further chat fine-tuning.

The 14 models are as follows:

  1. mistralai/Mistral-7B-Instruct-v0.2
  2. ehartford/dolphin-2.2.1-mistral-7b
  3. SciPhi/SciPhi-Mistral-7B-32k
  4. ehartford/samantha-1.2-mistral-7b
  5. Arc53/docsgpt-7b-mistral
  6. HuggingFaceH4/zephyr-7b-beta
  7. meta-math/MetaMath-Mistral-7B
  8. Open-Orca/Mistral-7B-OpenOrca
  9. openchat/openchat-3.5-1210
  10. beowolx/MistralHermes-CodePro-7B-v1
  11. TIGER-Lab/MAmmoTH-7B-Mistral
  12. teknium/OpenHermes-2.5-Mistral-7B
  13. Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
  14. mlabonne/NeuralHermes-2.5-Mistral-7B

Open LLM Leaderboard

v0.3 v0.4
Average 69.66 71.19
ARC 65.96 66.81
HellaSwag 85.29 86.15
MMLU 64.35 65.10
TruthfulQA 57.80 58.25
Winogrande 78.30 80.03
GSM8K 66.26 70.81

Chat Template

We tried ChatML and Llama-2 chat template, but feel free to try other templates.

Merge Configuration

The merge config file for this model is here:

models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: ehartford/dolphin-2.2.1-mistral-7b
    parameters:
      weight: 0.08
      density: 0.4
  - model: SciPhi/SciPhi-Mistral-7B-32k
    parameters:
      weight: 0.08
      density: 0.4
  - model: ehartford/samantha-1.2-mistral-7b
    parameters:
      weight: 0.08
      density: 0.4
  - model: Arc53/docsgpt-7b-mistral
    parameters:
      weight: 0.08
      density: 0.4
  - model: HuggingFaceH4/zephyr-7b-beta
    parameters:
      weight: 0.08
      density: 0.4
  - model: meta-math/MetaMath-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: Open-Orca/Mistral-7B-OpenOrca
    parameters:
      weight: 0.08
      density: 0.4
  - model: openchat/openchat-3.5-1210
    parameters:
      weight: 0.08
      density: 0.4
  - model: beowolx/MistralHermes-CodePro-7B-v1
    parameters:
      weight: 0.08
      density: 0.4
  - model: TIGER-Lab/MAmmoTH-7B-Mistral
    parameters:
      weight: 0.08
      density: 0.4
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
    parameters:
      weight: 0.08
      density: 0.4
  - model: mlabonne/NeuralHermes-2.5-Mistral-7B
    parameters:
      weight: 0.08
      density: 0.4
  - model: mistralai/Mistral-7B-Instruct-v0.2
    parameters:
      weight: 0.08
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
Downloads last month
424
Safetensors
Model size
7.24B params
Tensor type
BF16
·
Inference Examples
This model does not have enough activity to be deployed to Inference API (serverless) yet. Increase its social visibility and check back later, or deploy to Inference Endpoints (dedicated) instead.

Model tree for EmbeddedLLM/Mistral-7B-Merge-14-v0.3