---
license: llama2
tags:
- llama2
- quantized
- gguf
- 32k-context
---

# LLaMA-2-7B-32K #

[Together Computer, Inc.](https://together.ai/) has released
[LLaMA-2-7B-32K](https://huggingface.co/togethercomputer/LLaMA-2-7B-32K), a model based on Meta AI's LLaMA-2-7B,
but fine-tuned for context lengths of up to 32K using "position interpolation" and "Rotary Position Embeddings"
(RoPE).
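
The idea behind "position interpolation" can be sketched in a few lines of Python: instead of feeding RoPE the raw position index, the index is divided by the scaling factor, so that the extended 32K range maps back onto the 4K range the base model was trained on. The function below is an illustrative sketch only, not code from llama.cpp or from Together's training pipeline:

```python
def rope_angles(pos, dim, base=10000.0, scale=1.0):
    # Rotation angles used by RoPE for a single position index.
    # With position interpolation, the index is divided by `scale`
    # (here 8 = 32768 / 4096), so positions beyond the original
    # training range fall back into the range the model has seen.
    return [(pos / scale) / base ** (2 * i / dim) for i in range(dim // 2)]

# With scale = 8, position 32760 in the 32K model yields the same
# angles as position 4095 did in the original 4K model:
assert rope_angles(32760, dim=8, scale=8.0) == rope_angles(4095, dim=8)
```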

The current version of [llama.cpp](https://github.com/ggerganov/llama.cpp) supports such large context lengths
by means of the new [`--rope-scale`](https://github.com/ggerganov/llama.cpp/tree/master/examples/main#extended-context-size)
parameter.

> Nota bene: for the model described here, `--rope-scale` is `8` (the original context size was 4K, the
> fine-tuned one is 32K).
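
A hypothetical invocation of llama.cpp's `main` example with these settings might look as follows (the model file name is a placeholder - substitute whichever quantization you downloaded from this repo):

```shell
# model file name is a placeholder - use the quantization you downloaded
./main -m LLaMA-2-7B-32K-Q4_K_M.gguf \
       -c 32768 --rope-scale 8 \
       -p "Once upon a time"
```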

However, llama.cpp requires quantized model files in the new GGUF format - that's where this repo comes in:
it contains a few quantizations of the original weights from Together's fine-tuned model (as indicated by
the file names).

Concerning the license(s):

* the [original model](https://ai.meta.com/llama/) (from Meta AI) was released under a rather [permissive
  license](https://ai.meta.com/llama/license/)
* the fine-tuned model from Together Computer uses the
  [same license](https://huggingface.co/togethercomputer/LLaMA-2-7B-32K/blob/main/README.md)
* as a consequence, this repo does so as well