Model Description

This is an experiment comparing two methods for merging two models: DARE TIES and SLERP 🦙

We are mainly interested in comparing against Weyaxi/OpenHermes-2.5-neural-chat-v3-3-Slerp
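For readers unfamiliar with the baseline, the following is a minimal sketch of spherical linear interpolation (SLERP) on toy vectors. It is not the code used to produce the Weyaxi model: the function name `slerp`, the `eps` threshold, and the use of plain Python lists in place of weight tensors are assumptions made for illustration, and the inputs are assumed to be non-zero vectors.

```python
import math

def slerp(t, v0, v1, eps=1e-8):
    """Spherical linear interpolation between two equal-length, non-zero vectors."""
    norm0 = math.sqrt(sum(x * x for x in v0))
    norm1 = math.sqrt(sum(x * x for x in v1))
    # Cosine of the angle between the vectors, clamped so acos stays in range.
    dot = sum(a * b for a, b in zip(v0, v1)) / (norm0 * norm1)
    dot = max(-1.0, min(1.0, dot))
    theta = math.acos(dot)
    # Nearly parallel (or antiparallel) vectors: fall back to linear interpolation.
    if math.sin(theta) < eps:
        return [(1 - t) * a + t * b for a, b in zip(v0, v1)]
    s0 = math.sin((1 - t) * theta) / math.sin(theta)
    s1 = math.sin(t * theta) / math.sin(theta)
    return [s0 * a + s1 * b for a, b in zip(v0, v1)]

# Halfway between two orthogonal unit vectors stays on the unit circle.
midpoint = slerp(0.5, [1.0, 0.0], [0.0, 1.0])
print(midpoint)
```

Unlike a plain average, the interpolated point follows the arc between the two vectors, so two unit-length inputs give a unit-length result.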

The two models involved in the merge are:

  1. teknium/OpenHermes-2.5-Mistral-7B
  2. Intel/neural-chat-7b-v3-3

The YAML config file for the merge is:

models:
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: teknium/OpenHermes-2.5-Mistral-7B
    parameters:
      weight: 0.5
      density: 0.5
  - model: Intel/neural-chat-7b-v3-3
    parameters:
      weight: 0.5
      density: 0.5
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
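To make the roles of `weight` and `density` in the config above more concrete, here is a rough sketch of the DARE TIES idea on toy data. It is an illustration under stated assumptions, not the merge tool's implementation: the helper names `dare_prune` and `dare_ties_merge` are hypothetical, Python lists stand in for weight tensors, the weighted deltas are summed without any normalization, and `int8_mask` and `dtype` are not modeled.

```python
import random

def dare_prune(delta, density, rng):
    """DARE step: keep each entry with probability `density`, rescale kept entries by 1/density."""
    return [d / density if rng.random() < density else 0.0 for d in delta]

def dare_ties_merge(base, finetuned, weights, densities, seed=0):
    """Merge fine-tuned weight lists into `base` (hypothetical helper, toy scale)."""
    rng = random.Random(seed)
    # Task vectors: what each fine-tune changed relative to the base model.
    deltas = [[f - b for f, b in zip(model, base)] for model in finetuned]
    # Sparsify and rescale each task vector, then apply its merge weight.
    weighted = [
        [w * d for d in dare_prune(delta, density, rng)]
        for delta, w, density in zip(deltas, weights, densities)
    ]
    merged = []
    for i, b in enumerate(base):
        column = [wd[i] for wd in weighted]
        # TIES step: elect a sign from the summed weighted deltas,
        # then keep only the contributions that agree with that sign.
        elected = 1.0 if sum(column) >= 0 else -1.0
        agreeing = sum(c for c in column if c * elected > 0)
        merged.append(b + agreeing)
    return merged

# Toy stand-ins for the base model and the two fine-tunes (values are made up).
base = [0.0, 0.0, 0.0]
model_a = [1.0, 1.0, -1.0]
model_b = [1.0, -3.0, 0.0]
print(dare_ties_merge(base, [model_a, model_b], weights=[0.5, 0.5], densities=[0.5, 0.5]))
```

With `density: 0.5`, roughly half of each model's changes are dropped at random and the rest are doubled, so the expected size of each task vector is preserved before the sign election resolves conflicts between the two models.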

Open LLM Leaderboard

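In this run SLERP scores slightly higher on every benchmark. Note that DARE TIES might achieve better results with more tuning of the merge parameters (for example, weight and density).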

Metric       DARE TIES   SLERP
Average      70.69       71.38
ARC          67.49       68.09
HellaSwag    85.78       86.2
MMLU         64.1        64.26
TruthfulQA   60.52       62.78
Winogrande   79.01       79.16
GSM8K        67.25       67.78
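As a quick sanity check on the table, the Average row can be reproduced as the unweighted mean of the six benchmark scores. The short script below does this and also reports the per-benchmark gap. The variable names are illustrative; the numbers are copied from the table.

```python
# Scores from the table above: benchmark -> (DARE TIES, SLERP).
scores = {
    "ARC": (67.49, 68.09),
    "HellaSwag": (85.78, 86.2),
    "MMLU": (64.1, 64.26),
    "TruthfulQA": (60.52, 62.78),
    "Winogrande": (79.01, 79.16),
    "GSM8K": (67.25, 67.78),
}

# Unweighted mean over the six benchmarks, rounded to two decimals.
dare_avg = round(sum(d for d, _ in scores.values()) / len(scores), 2)
slerp_avg = round(sum(s for _, s in scores.values()) / len(scores), 2)

# Gap per benchmark (SLERP minus DARE TIES) and the benchmark with the largest gap.
gaps = {name: round(s - d, 2) for name, (d, s) in scores.items()}
largest_gap = max(gaps, key=gaps.get)

print(dare_avg, slerp_avg)             # 70.69 71.38
print(largest_gap, gaps[largest_gap])  # TruthfulQA 2.26
```

Most of the difference in the averages comes from TruthfulQA. The other five benchmarks differ by 0.6 points or less.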
