mav23 committed on
Commit 36caf2b
1 Parent(s): a9fb90f

Upload folder using huggingface_hub

Files changed (3)
  1. .gitattributes +1 -0
  2. README.md +292 -0
  3. ms-schisandra-22b-v0.2.Q4_0.gguf +3 -0
.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ ms-schisandra-22b-v0.2.Q4_0.gguf filter=lfs diff=lfs merge=lfs -text
README.md ADDED
@@ -0,0 +1,292 @@
+ ---
+ language:
+ - en
+ license: other
+ library_name: transformers
+ tags:
+ - mergekit
+ - merge
+ base_model:
+ - unsloth/Mistral-Small-Instruct-2409
+ - Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small
+ - anthracite-org/magnum-v4-22b
+ - ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1
+ - spow12/ChatWaifu_v2.0_22B
+ - rAIfle/Acolyte-22B
+ - Envoid/Mistral-Small-NovusKyver
+ - InferenceIllusionist/SorcererLM-22B
+ - allura-org/MS-Meadowlark-22B
+ - crestf411/MS-sunfall-v0.7.0
+ model-index:
+ - name: MS-Schisandra-22B-v0.2
+   results:
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: IFEval (0-Shot)
+       type: HuggingFaceH4/ifeval
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: inst_level_strict_acc and prompt_level_strict_acc
+       value: 63.83
+       name: strict accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nohobby/MS-Schisandra-22B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: BBH (3-Shot)
+       type: BBH
+       args:
+         num_few_shot: 3
+     metrics:
+     - type: acc_norm
+       value: 40.61
+       name: normalized accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nohobby/MS-Schisandra-22B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MATH Lvl 5 (4-Shot)
+       type: hendrycks/competition_math
+       args:
+         num_few_shot: 4
+     metrics:
+     - type: exact_match
+       value: 19.94
+       name: exact match
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nohobby/MS-Schisandra-22B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: GPQA (0-shot)
+       type: Idavidrein/gpqa
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 11.41
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nohobby/MS-Schisandra-22B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MuSR (0-shot)
+       type: TAUR-Lab/MuSR
+       args:
+         num_few_shot: 0
+     metrics:
+     - type: acc_norm
+       value: 10.67
+       name: acc_norm
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nohobby/MS-Schisandra-22B-v0.2
+       name: Open LLM Leaderboard
+   - task:
+       type: text-generation
+       name: Text Generation
+     dataset:
+       name: MMLU-PRO (5-shot)
+       type: TIGER-Lab/MMLU-Pro
+       config: main
+       split: test
+       args:
+         num_few_shot: 5
+     metrics:
+     - type: acc
+       value: 34.85
+       name: accuracy
+     source:
+       url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Nohobby/MS-Schisandra-22B-v0.2
+       name: Open LLM Leaderboard
+ ---
+ ***
+ ## Schisandra
+ 
+ Many thanks to the authors of the models used!
+ 
+ [RPMax v1.1](https://huggingface.co/ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1) | [Pantheon-RP](https://huggingface.co/Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small) | [UnslopSmall-v1](https://huggingface.co/TheDrummer/UnslopSmall-22B-v1) | [Magnum V4](https://huggingface.co/anthracite-org/magnum-v4-22b) | [ChatWaifu v2.0](https://huggingface.co/spow12/ChatWaifu_v2.0_22B) | [SorcererLM](https://huggingface.co/InferenceIllusionist/SorcererLM-22B) | [Acolyte](https://huggingface.co/rAIfle/Acolyte-22B) | [NovusKyver](https://huggingface.co/Envoid/Mistral-Small-NovusKyver) | [Meadowlark](https://huggingface.co/allura-org/MS-Meadowlark-22B) | [Sunfall](https://huggingface.co/crestf411/MS-sunfall-v0.7.0)
+ ***
+ 
+ ### Overview
+ 
+ Main uses: RP, Storywriting
+ 
+ Prompt format: Mistral-V3
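+ 
+ The card doesn't spell the template out, so here is a minimal sketch of the V3-style `[INST]` layout (the helper name and exact spacing are illustrative, not from the card; the tokenizer's bundled chat template is authoritative):
+ 
+ ```python
+ # Hypothetical helper that assembles a Mistral-V3 style prompt string.
+ # Exact whitespace/BOS handling varies; trust the model's chat template.
+ def build_prompt(turns, system=""):
+     """turns: list of (user, assistant) pairs; the last assistant may be ""."""
+     prompt = "<s>"
+     for i, (user, assistant) in enumerate(turns):
+         # By convention the system prompt is folded into the first user turn.
+         content = f"{system}\n\n{user}" if (system and i == 0) else user
+         prompt += f"[INST] {content}[/INST]"
+         if assistant:
+             prompt += f" {assistant}</s>"
+     return prompt
+ 
+ print(build_prompt([("Write one line of dialogue for a tired knight.", "")]))
+ ```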
+ 
+ An intelligent model that is attentive to detail and has a low-slop writing style. This time with a stable tokenizer.
+ 
+ Oh, and it now contains 10 finetunes! Not sure whether all of them actually contribute to the output, but it's nice to see the numbers grow.
+ 
+ ***
+ 
+ ### Quants
+ 
+ GGUF: [Static](https://huggingface.co/mradermacher/MS-Schisandra-22B-v0.2-GGUF) | [Imatrix](https://huggingface.co/mradermacher/MS-Schisandra-22B-v0.2-i1-GGUF)
+ 
+ exl2: [4.65bpw](https://huggingface.co/waldie/MS-Schisandra-22B-v0.2-4.65bpw-h6-exl2) | [5.5bpw](https://huggingface.co/waldie/MS-Schisandra-22B-v0.2-5.5bpw-h6-exl2) | [6.5bpw](https://huggingface.co/waldie/MS-Schisandra-22B-v0.2-6.5bpw-h6-exl2)
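+ 
+ For a quick local test of the Q4_0 file uploaded in this commit, something like the following should work. A sketch, assuming `llama-cpp-python` is installed; the repo id below is an assumption about where this GGUF lives, not part of the original card:
+ 
+ ```python
+ # Sketch: run the Q4_0 quant with llama-cpp-python.
+ from llama_cpp import Llama
+ 
+ llm = Llama.from_pretrained(
+     repo_id="mav23/MS-Schisandra-22B-v0.2-GGUF",   # assumed repo id
+     filename="ms-schisandra-22b-v0.2.Q4_0.gguf",
+     n_ctx=8192,       # context length; size to your RAM/VRAM
+     n_gpu_layers=-1,  # offload all layers if a GPU is available
+ )
+ out = llm.create_chat_completion(
+     messages=[{"role": "user", "content": "Introduce yourself in one sentence."}],
+     max_tokens=128,
+ )
+ print(out["choices"][0]["message"]["content"])
+ ```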
+ 
+ ***
+ 
+ ### Settings
+ 
+ My SillyTavern preset: https://huggingface.co/Nohobby/MS-Schisandra-22B-v0.2/resolve/main/ST-formatting-Schisandra.json
+ 
+ ***
+ 
+ ## Merge Details
+ ### Merging steps
+ 
+ ## Step1
+ (Config partially taken from [here](https://huggingface.co/Casual-Autopsy/L3-Super-Nova-RP-8B))
+ 
+ ```yaml
+ base_model: spow12/ChatWaifu_v2.0_22B
+ parameters:
+   int8_mask: true
+   rescale: true
+   normalize: false
+ dtype: bfloat16
+ tokenizer_source: base
+ merge_method: della
+ models:
+   - model: Envoid/Mistral-Small-NovusKyver
+     parameters:
+       density: [0.35, 0.65, 0.5, 0.65, 0.35]
+       epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
+       lambda: 0.85
+       weight: [-0.01891, 0.01554, -0.01325, 0.01791, -0.01458]
+   - model: rAIfle/Acolyte-22B
+     parameters:
+       density: [0.6, 0.4, 0.5, 0.4, 0.6]
+       epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
+       lambda: 0.85
+       weight: [0.01847, -0.01468, 0.01503, -0.01822, 0.01459]
+ ```
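+ 
+ A rough gloss of the della knobs (worth verifying against mergekit's docs): `density` is the fraction of delta weights kept from each donor, `epsilon` sets the spread of the adaptive drop probabilities around that density, `lambda` rescales the merged deltas, and list values act as layer gradients. To reproduce a step, a config like the one above can be run through mergekit's Python API; a sketch, assuming mergekit is installed and the YAML is saved as `step1.yml` (file and output names here are illustrative):
+ 
+ ```python
+ # Sketch: run a mergekit config programmatically
+ # (CLI equivalent: mergekit-yaml step1.yml ./Step1).
+ import yaml
+ import torch
+ from mergekit.config import MergeConfiguration
+ from mergekit.merge import MergeOptions, run_merge
+ 
+ with open("step1.yml") as f:
+     merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))
+ 
+ run_merge(
+     merge_config,
+     out_path="./Step1",  # later steps refer to this directory by name
+     options=MergeOptions(
+         cuda=torch.cuda.is_available(),  # merge on GPU if one is present
+         copy_tokenizer=True,
+     ),
+ )
+ ```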
+ 
+ ## Step2
+ (Config partially taken from [here](https://huggingface.co/Casual-Autopsy/L3-Super-Nova-RP-8B))
+ 
+ ```yaml
+ base_model: InferenceIllusionist/SorcererLM-22B
+ parameters:
+   int8_mask: true
+   rescale: true
+   normalize: false
+ dtype: bfloat16
+ tokenizer_source: base
+ merge_method: della
+ models:
+   - model: crestf411/MS-sunfall-v0.7.0
+     parameters:
+       density: [0.35, 0.65, 0.5, 0.65, 0.35]
+       epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
+       lambda: 0.85
+       weight: [-0.01891, 0.01554, -0.01325, 0.01791, -0.01458]
+   - model: anthracite-org/magnum-v4-22b
+     parameters:
+       density: [0.6, 0.4, 0.5, 0.4, 0.6]
+       epsilon: [0.1, 0.1, 0.25, 0.1, 0.1]
+       lambda: 0.85
+       weight: [0.01847, -0.01468, 0.01503, -0.01822, 0.01459]
+ ```
+ 
+ ## SchisandraVA2
+ (Config taken from [here](https://huggingface.co/HiroseKoichi/Llama-3-8B-Stroganoff-4.0))
+ 
+ ```yaml
+ merge_method: della_linear
+ dtype: bfloat16
+ parameters:
+   normalize: true
+   int8_mask: true
+ tokenizer_source: base
+ base_model: TheDrummer/UnslopSmall-22B-v1
+ models:
+   - model: ArliAI/Mistral-Small-22B-ArliAI-RPMax-v1.1
+     parameters:
+       density: 0.55
+       weight: 1
+   - model: Gryphe/Pantheon-RP-Pure-1.6.2-22b-Small
+     parameters:
+       density: 0.55
+       weight: 1
+   - model: Step1
+     parameters:
+       density: 0.55
+       weight: 1
+   - model: allura-org/MS-Meadowlark-22B
+     parameters:
+       density: 0.55
+       weight: 1
+   - model: Step2
+     parameters:
+       density: 0.55
+       weight: 1
+ ```
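+ 
+ Note that `Step1` and `Step2` are not Hub IDs: they are, presumably, the local output directories produced by the two merges above, which mergekit resolves as paths.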
+ 
+ ## Schisandra-v0.2
+ 
+ ```yaml
+ dtype: bfloat16
+ tokenizer_source: base
+ merge_method: della_linear
+ parameters:
+   density: 0.5
+ base_model: SchisandraVA2
+ models:
+   - model: unsloth/Mistral-Small-Instruct-2409
+     parameters:
+       weight:
+         - filter: v_proj
+           value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
+         - filter: o_proj
+           value: [1, 0, 1, 0, 0, 0, 0, 0, 1, 1, 1]
+         - filter: up_proj
+           value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
+         - filter: gate_proj
+           value: [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
+         - filter: down_proj
+           value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
+         - value: 0
+   - model: SchisandraVA2
+     parameters:
+       weight:
+         - filter: v_proj
+           value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
+         - filter: o_proj
+           value: [0, 1, 0, 1, 1, 1, 1, 1, 0, 0, 0]
+         - filter: up_proj
+           value: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]
+         - filter: gate_proj
+           value: [1, 1, 0, 0, 0, 0, 0, 0, 0, 1, 1]
+         - filter: down_proj
+           value: [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]
+         - value: 1
+ ```
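+ 
+ The per-filter `value` lists are layer gradients: each 11-point list is stretched across the layer stack, so the two parents trade off complementary tensors by depth (`v_proj` and `gate_proj` come from the instruct model mid-stack and from SchisandraVA2 at the ends; the trailing bare `value` is the default weight for unfiltered tensors). A rough illustration of that expansion, assuming linear interpolation and 56 decoder layers for a 22B Mistral-Small; both are assumptions, not mergekit's verbatim behavior:
+ 
+ ```python
+ # Illustration only: expand an 11-point mergekit gradient to per-layer weights.
+ import numpy as np
+ 
+ def expand_gradient(values, num_layers):
+     anchors = np.linspace(0, num_layers - 1, num=len(values))  # anchor positions
+     return np.interp(np.arange(num_layers), anchors, values)   # linear interpolation
+ 
+ v_proj_instruct = [0, 0, 1, 1, 1, 1, 1, 1, 1, 0, 0]
+ print(expand_gradient(v_proj_instruct, 56).round(2))  # ~0 at both ends, 1 mid-stack
+ ```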
+ # [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
+ Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Nohobby__MS-Schisandra-22B-v0.2)
+ 
+ | Metric              | Value |
+ |---------------------|------:|
+ | Avg.                | 30.22 |
+ | IFEval (0-Shot)     | 63.83 |
+ | BBH (3-Shot)        | 40.61 |
+ | MATH Lvl 5 (4-Shot) | 19.94 |
+ | GPQA (0-shot)       | 11.41 |
+ | MuSR (0-shot)       | 10.67 |
+ | MMLU-PRO (5-shot)   | 34.85 |
+ 
ms-schisandra-22b-v0.2.Q4_0.gguf ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:5d3900c77335a1aed8a88a678d5fa24bdfc3cee2064e9f28bccb4abd9fd38f21
+ size 12569164736
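
(The entry is a Git LFS pointer, not the weights themselves: the actual Q4_0 file is 12,569,164,736 bytes, about 11.7 GiB, stored via LFS.)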