---
library_name: transformers
tags:
- llama-3
license: cc-by-nc-4.0
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/65b19c1b098c85365af5a83e/mnKtH1BMVHFAHZVEp3rQv.png)

[GGUF Quants](https://huggingface.co/mradermacher/l3-badger-mushroom-4x8b-i1-GGUF)

# Badger Mushroom 4x8b

I've been really impressed with how well these frankenmoe models quantize compared to the base Llama 8b, while offering far better speed than the 70b. An 8x8b seemed a bit unnecessary for the additional value it brought, so I dialed it back to a 4x8b version. This model feels pretty good out of the gate, which, considering that I used a non-standard merge, is a bit surprising.

```yaml
base_model: ./maldv/badger
gate_mode: hidden 
dtype: bfloat16
experts_per_token: 2
experts:
  - source_model: ./models/instruct/Llama-3-SauerkrautLM-8b-Instruct
    positive_prompts:
        <some words>
    negative_prompts:
        <some words>
  - source_model: ./models/instruct/opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5
    positive_prompts:
        <some words>
    negative_prompts:
        <some words>
  - source_model: ./models/instruct/Llama-3-8B-Instruct-DPO-v0.4
    positive_prompts:
        <some words>
    negative_prompts:
        <some words>
  - source_model: ./models/instruct/Poppy_Porpoise-0.72-L3-8B
    positive_prompts:
        <some words>
    negative_prompts:
        <some words>
```
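
With this config, the router sends each token through two of the four experts (`experts_per_token: 2`), with gate vectors derived from hidden-state activations on the positive/negative prompts (`gate_mode: hidden`). As a rough illustration of what that top-2 routing means at inference time, here is a minimal sketch; `gate` and `experts` are hypothetical stand-ins, not the merged model's actual forward pass:

```python
import torch

def route(hidden: torch.Tensor, gate: torch.nn.Linear, experts: list, k: int = 2):
    # Score every expert for every token, keep the top-k per token.
    logits = gate(hidden)                          # [tokens, n_experts]
    weights, idx = torch.topk(logits, k, dim=-1)   # [tokens, k]
    weights = torch.softmax(weights, dim=-1)       # renormalize over the top-k
    out = torch.zeros_like(hidden)
    # Each token's output is the weighted sum of its k selected experts.
    for slot in range(k):
        for e in range(len(experts)):
            mask = idx[:, slot] == e
            if mask.any():
                out[mask] += weights[mask, slot, None] * experts[e](hidden[mask])
    return out

# Toy usage: four stand-in "experts" over a 64-dim hidden state.
d, n_experts = 64, 4
gate = torch.nn.Linear(d, n_experts, bias=False)
experts = [torch.nn.Linear(d, d) for _ in range(n_experts)]
tokens = torch.randn(10, d)
print(route(tokens, gate, experts).shape)  # torch.Size([10, 64])
```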

### Badger

Badger is a cascading [Fourier interpolation](./tensor.py#3) of the following models, with the merge order based on pairwise layer cosine similarity:

```python
[
 'opus-v1.2-llama-3-8b-instruct-run3.5-epoch2.5',
 'Llama-3-SauerkrautLM-8b-Instruct',
 'Llama-3-8B-Instruct-DPO-v0.4',
 'Roleplay-Llama-3-8B',
 'Llama-3-Lumimaid-8B-v0.1',
 'Poppy_Porpoise-0.72-L3-8B',
 'L3-TheSpice-8b-v0.8.3',
 'Llama-3-LewdPlay-8B-evo',
 'Llama-3-8B-Instruct-norefusal',
 'Meta-Llama-3-8B-Instruct-DPO',
 'Llama-3-Soliloquy-8B-v2'
]
```
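
The actual interpolation lives in [tensor.py](./tensor.py); as a minimal sketch of just the ordering step, assuming each model contributes one weight tensor per layer (the function names here are hypothetical):

```python
import torch
import torch.nn.functional as F

def layer_similarity(a: torch.Tensor, b: torch.Tensor) -> float:
    # Cosine similarity between two weight tensors, flattened to vectors.
    return F.cosine_similarity(a.flatten(), b.flatten(), dim=0).item()

def order_by_similarity(tensors: dict) -> list:
    # Greedy chain: start from an arbitrary model, then repeatedly append
    # the remaining model whose layer is most similar to the last one added.
    names = list(tensors)
    order = [names.pop(0)]
    while names:
        last = tensors[order[-1]]
        nxt = max(names, key=lambda n: layer_similarity(last, tensors[n]))
        names.remove(nxt)
        order.append(nxt)
    return order

# Toy tensors standing in for one layer of each model:
toy = {name: torch.randn(16, 16) for name in ["model_a", "model_b", "model_c"]}
print(order_by_similarity(toy))
```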

I'm finding my iq4_nl quant to work well. The Llama 3 instruct format works really well, but a minimal format is also highly creative. So far it performs well in each of the four areas I've tested it in: roleplay, logic, writing, and assistant behaviors.
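
A minimal sketch of running it with the Llama 3 instruct format via the standard `transformers` chat-templating path (nothing model-specific assumed beyond the repo id):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "maldv/l3-badger-mushroom-4x8b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype="auto", device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Write a short scene set in a mushroom forest."},
]
# apply_chat_template renders the Llama 3 instruct format for us.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```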

## Scores

Not too bad; similar to the other *highly recommended* [L3-Arcania-4x8b](https://huggingface.co/Steelskull/L3-Arcania-4x8b).

Metric | Score
---|---
Average | 67.09
ARC | 61.69
HellaSwag | 81.33
MMLU | 66.37
TruthfulQA | 49.82
Winogrande | 77.43
GSM8K | 65.88

[Details](https://huggingface.co/datasets/open-llm-leaderboard/details_maldv__l3-badger-mushroom-4x8b)