---
license: apache-2.0
---
# Mistral-7B-v0.1

## Description
This repo contains GGUF format model files for Mistral-7B-v0.1.

## Files Provided
| Name                         | Quant   | Bits | File Size | Remark                           |
| ---------------------------- | ------- | ---- | --------- | -------------------------------- |
| mistral-7b-v0.1.IQ3_XXS.gguf | IQ3_XXS | 3    | 3.02 GB   | 3.06 bpw quantization            |
| mistral-7b-v0.1.IQ3_S.gguf   | IQ3_S   | 3    | 3.18 GB   | 3.44 bpw quantization            |
| mistral-7b-v0.1.IQ3_M.gguf   | IQ3_M   | 3    | 3.28 GB   | 3.66 bpw quantization mix        |
| mistral-7b-v0.1.IQ4_NL.gguf  | IQ4_NL  | 4    | 4.16 GB   | 4.25 bpw non-linear quantization |
| mistral-7b-v0.1.Q4_K_M.gguf  | Q4_K_M  | 4    | 4.37 GB   | 3.80G, +0.0532 ppl               |
| mistral-7b-v0.1.Q5_K_M.gguf  | Q5_K_M  | 5    | 5.13 GB   | 4.45G, +0.0122 ppl               |
| mistral-7b-v0.1.Q6_K.gguf    | Q6_K    | 6    | 5.94 GB   | 5.15G, +0.0008 ppl               |
| mistral-7b-v0.1.Q8_0.gguf    | Q8_0    | 8    | 7.70 GB   | 6.70G, +0.0004 ppl               |

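The bits-per-weight (bpw) figures in the table can be sanity-checked from the file sizes: bpw is roughly the file size in bits divided by the parameter count. A minimal sketch, assuming a parameter count of about 7.24 billion for Mistral 7B; the estimate runs a little above the nominal bpw because the file also carries metadata and a few tensors kept at higher precision:

```python
def approx_bpw(file_size_gb: float, n_params: float = 7.24e9) -> float:
    """Estimate bits per weight from a GGUF file size.

    Assumes GB means 10^9 bytes, as Hugging Face file listings do,
    and that essentially all bytes are model weights.
    """
    return file_size_gb * 1e9 * 8 / n_params

# e.g. approx_bpw(3.02) for the IQ3_XXS file gives roughly 3.3,
# slightly above the 3.06 bpw quoted for the quant type itself
```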
## Parameters
| path                      | type    | architecture       | rope_theta | sliding_win | max_pos_embed |
| ------------------------- | ------- | ------------------ | ---------- | ----------- | ------------- |
| mistralai/Mistral-7B-v0.1 | mistral | MistralForCausalLM | 10000.0    | 4096        | 32768         |

# Original Model Card

---
license: apache-2.0
pipeline_tag: text-generation
language:
- en
tags:
- pretrained
inference:
  parameters:
    temperature: 0.7
---

# Model Card for Mistral-7B-v0.1

The Mistral-7B-v0.1 Large Language Model (LLM) is a pretrained generative text model with 7 billion parameters.
Mistral-7B-v0.1 outperforms Llama 2 13B on all benchmarks we tested.

For full details of this model please read our [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/announcing-mistral-7b/).

## Model Architecture

Mistral-7B-v0.1 is a transformer model, with the following architecture choices:
- Grouped-Query Attention
- Sliding-Window Attention
- Byte-fallback BPE tokenizer
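With sliding-window attention, each query token attends only to the last `W` key positions instead of the whole prefix. A minimal sketch of the causal sliding-window mask (window size shortened for readability; Mistral-7B-v0.1 uses W = 4096):

```python
def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Boolean attention mask: entry [i][j] is True when query position i
    may attend to key position j, i.e. j is causal (j <= i) and lies
    within the last `window` positions."""
    return [[(j <= i) and (i - j < window) for j in range(seq_len)]
            for i in range(seq_len)]

mask = sliding_window_mask(seq_len=5, window=3)
# With window=3, position 4 attends to positions 2, 3 and 4 only.
```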

## Troubleshooting

- If you see the following error:
  ```
  KeyError: 'mistral'
  ```
- Or:
  ```
  NotImplementedError: Cannot copy out of meta tensor; no data!
  ```

Ensure you are using a stable version of Transformers, 4.34.0 or newer.
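Both errors indicate a Transformers release that predates Mistral support. A quick sketch for checking the installed version before loading the model, using the standard `packaging` version helper (the helper function name is our own):

```python
from packaging.version import Version

def transformers_supports_mistral(installed: str) -> bool:
    """Mistral support landed in Transformers 4.34.0."""
    return Version(installed) >= Version("4.34.0")

# In practice you would pass the real installed version, e.g.:
#   from importlib.metadata import version
#   transformers_supports_mistral(version("transformers"))
```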

## Notice

Mistral 7B is a pretrained base model and therefore does not have any moderation mechanisms.

## The Mistral AI Team

Albert Jiang, Alexandre Sablayrolles, Arthur Mensch, Chris Bamford, Devendra Singh Chaplot, Diego de las Casas, Florian Bressand, Gianna Lengyel, Guillaume Lample, Lélio Renard Lavaud, Lucile Saulnier, Marie-Anne Lachaux, Pierre Stock, Teven Le Scao, Thibaut Lavril, Thomas Wang, Timothée Lacroix, William El Sayed.