llama3.1-8b-spaetzle-v90

These are q4_k_m quants made with llama.cpp b3472 from cstr/llama3.1-8b-spaetzle-v90 which is a progressive merge of merges.

EQ-Bench v2_de: 69.93 (171/171).

The merge tree involves the following models:

  • NousResearch/Hermes-3-Llama-3.1-8B
  • Undi95/Meta-Llama-3.1-8B-Claude
  • Dampfinchen/Llama-3.1-8B-Ultra-Instruct
  • VAGOsolutions/Llama-3.1-SauerkrautLM-8b-Instruct
  • akjindal53244/Llama-3.1-Storm-8B
  • nbeerbower/llama3.1-gutenberg-8B
  • Undi95/Meta-Llama-3.1-8B-Claude
  • DiscoResearch/Llama3-DiscoLeo-Instruct-8B-v0.1
  • nbeerbower/llama-3-wissenschaft-8B-v2
  • Azure99/blossom-v5-llama3-8b
  • VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
  • princeton-nlp/Llama-3-Instruct-8B-SimPO
  • Locutusque/llama-3-neural-chat-v1-8b
  • Locutusque/Llama-3-Orca-1.0-8B
  • DiscoResearch/Llama3_DiscoLM_German_8b_v0.1_experimental
  • seedboxai/Llama-3-Kafka-8B-v0.2
  • VAGOsolutions/Llama-3-SauerkrautLM-8b-Instruct
  • nbeerbower/llama-3-wissenschaft-8B-v2
  • mlabonne/Daredevil-8B-abliterated-dpomix

There have been a number of steps involved, among which, slep merging of only middle layers compensating for tokenizer / chat template differences. An illustration below.

🧩 Configuration

The final merge for this was:

models:
  - model: cstr/llama3.1-8b-spaetzle-v59
    # no parameters necessary for base model
  - model: cstr/llama3.1-8b-spaetzle-v85
    parameters:
      density: 0.65
      weight: 0.3
  - model: cstr/llama3.1-8b-spaetzle-v86
    parameters:
      density: 0.65
      weight: 0.3
  - model: cstr/llama3.1-8b-spaetzle-v74
    parameters:
      density: 0.65
      weight: 0.3
merge_method: dare_ties
base_model: cstr/llama3.1-8b-spaetzle-v59
parameters:
  int8_mask: true
dtype: bfloat16
random_seed: 0
tokenizer_source: base

Among the previous steps:

models:
  - model: NousResearch/Hermes-3-Llama-3.1-8B
merge_method: slerp
base_model: cstr/llama3.1-8b-spaetzle-v74
parameters:
  t:
    - value: [0, 0, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0, 0]
dtype: float16

πŸ’» Usage

Use with llama3 chat template as common. The q4km quants here are from cstr/llama3.1-8b-spaetzle-v90.

Downloads last month
2
GGUF
Model size
8.03B params
Architecture
llama

4-bit

Inference API
Unable to determine this model's library. Check the docs .

Model tree for cstr/llama3.1-8b-spaetzle-v90-GGUF

Collection including cstr/llama3.1-8b-spaetzle-v90-GGUF