Tristan Druyen committed on
Commit 76c8ca9 · unverified · 1 Parent(s): 9439554

Update README.md

Files changed (1)
  1. README.md +18 -0
README.md CHANGED
@@ -1,3 +1,21 @@
  ---
  license: apache-2.0
+ tags:
+ - mixtral
+ - conversational
+ - finetune
  ---
+
+ # Model Card for Cerebrum-1.0-8x7b-imatrix-GGUF
+
+ Quantized from https://huggingface.co/AetherResearch/Cerebrum-1.0-8x7b
+ using llama.cpp commit 46acb3676718b983157058aecf729a2064fc7d34 utilizing an importance matrix.
+
+ Quants are being uploaded over a slow German internet connection, so they will appear one by one. Stay tuned.
+
+ imatrix generated with:
+
+ ./imatrix -ofreq 4 -b 512 -c 512 -t 14 --chunks 24 -m ../models/Cerebrum-1.0-8x7b-GGUF/cerebrum-1.0-8x7b-Q8_0.gguf -f ./groups_merged.txt
+
+ Unfortunately, the imatrix was generated from the Q8_0 model rather than the unquantized f16 model, as it ideally should be, because I can't get it to work with the f16 on my machine at the moment. It should still improve the performance of the quants, though.
+
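
The README describes how the importance matrix was generated but not how it is used afterwards. As a rough sketch of that next step, the script below prints (without running) one llama.cpp `quantize` command per target type, passing the matrix via `--imatrix`. The file names (`imatrix.dat`, the f16 input, the output names) and the list of quant types are assumptions for illustration and are not taken from this commit.

```shell
#!/bin/sh
# Dry-run sketch: print one quantize command per target type.
# Every path and the type list below are hypothetical placeholders.
IMATRIX="imatrix.dat"                 # assumed output file of the ./imatrix run
SRC="cerebrum-1.0-8x7b-f16.gguf"      # assumed unquantized input model
QUANT_TYPES="Q4_K_M Q5_K_M"           # assumed example targets

print_quantize_cmds() {
    for qtype in $QUANT_TYPES; do
        # --imatrix passes the importance matrix file to quantize
        echo "./quantize --imatrix $IMATRIX $SRC cerebrum-1.0-8x7b-$qtype.gguf $qtype"
    done
}

print_quantize_cmds
```

Printing the commands first makes it easy to check paths and types before committing to a long quantization run; piping the output to `sh` would execute them.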