MaziyarPanahi committed
Commit c750257 • 1 Parent(s): b0415da

Update README.md

Files changed (1):
  1. README.md +16 -2
README.md CHANGED
@@ -20,9 +20,21 @@ tags:
 - moe
 ---
 
+<img src="./mixtral-8x22b.jpeg" width="600" />
+
 # Mixtral-8x22B-v0.1-GGUF
 
-in progress ...
+On April 10th, @MistralAI released a model named "Mixtral 8x22B," a 176B MoE, via magnet link (torrent):
+
+- 176B MoE with ~40B active parameters
+- Context length of 65k tokens
+- The base model can be fine-tuned
+- Requires ~260GB of VRAM in fp16, 73GB in int4
+- Licensed under Apache 2.0, according to their Discord
+- Available on @huggingface (community)
+- Uses a tokenizer similar to previous models
+
+The GGUF and quantized models here are based on the [v2ray/Mixtral-8x22B-v0.1](https://huggingface.co/v2ray/Mixtral-8x22B-v0.1) model.
 
 ## Load sharded model
@@ -81,7 +93,9 @@ Since this appears to be a base model, it will keep on generating.
 
 ## Credit
 
-Thank you [MistralAI](https://huggingface.co/mistralai) for opening the weights and thank you [v2ray](https://huggingface.co/v2ray/) for converting and preparing the [Mixtral-8x22B-v0.1](https://huggingface.co/v2ray/Mixtral-8x22B-v0.1) model.
+- [MistralAI](https://huggingface.co/mistralai) for opening the weights
+- [v2ray](https://huggingface.co/v2ray/) for downloading, converting, and sharing [Mixtral-8x22B-v0.1](https://huggingface.co/v2ray/Mixtral-8x22B-v0.1) with the community
+- [philschmid](https://huggingface.co/philschmid) for the photo he shared on his Twitter
 
 ▄▄▄░░
 ▄▄▄▄▄█████████░░░░
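For readers estimating hardware needs from the VRAM bullet in the updated README, weight memory scales as parameters × bits per weight. A minimal back-of-envelope sketch (weights only — it ignores KV cache and runtime overhead, so real usage is higher):

```python
def weight_size_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the model weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# e.g. the ~40B active parameters touched per forward pass, at 4 bits per weight:
print(round(weight_size_gb(40e9, 4)))  # 20
```

The same formula applied to the headline 176B total at fp16 gives 352GB, a bit above the README's quoted ~260GB; the quoted figures are approximate, so treat any such estimate as a rough lower bound for provisioning.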