Update README.md
README.md CHANGED

@@ -14,14 +14,8 @@ tags:
 but fine-tuned for context lengths up to 32K using "Position interpolation" and "Rotary Position Embeddings"
 (RoPE).
 
-
-
-parameter.
-
-> Nota bene: for the model described here the `--rope-scale` is `8` (original context size was 4k, the
-> fine-tuned one is 32k)
-
-However, llama.cpp requires quantized files in the new GGUF format - that's where this repo comes in:
+While the current version of [llama.cpp](https://github.com/ggerganov/llama.cpp) already supports such large
+context lengths, it requires quantized files in the new GGUF format - and that's where this repo comes in:
 it contains a few quantizations of the original weights from Together's fine-tuned model (as indicated by
 the file names)
 
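The position interpolation mentioned in the README (and the `--rope-scale` of `8` from the removed note) can be sketched in a few lines of Python. This is illustrative only — the function name and dimension are made up, not part of llama.cpp or the model: linearly dividing position indices by the 32K/4K = 8 ratio keeps the RoPE angles within the range the model saw during its original 4K pre-training.

```python
def rope_angles(pos, dim=8, base=10000.0, scale=1.0):
    # Rotary-embedding angles for one position index; with linear position
    # interpolation the index is divided by `scale` (8 for 4K -> 32K) before
    # the usual per-dimension frequency is applied.
    return [(pos / scale) / base ** (2 * i / dim) for i in range(dim // 2)]

# A position at 20000 (well beyond the original 4K window), interpolated
# with scale 8, yields the same angles as position 2500 without scaling.
assert rope_angles(20000, scale=8.0) == rope_angles(2500)
```

In other words, the fine-tuned 32K context is "compressed" back into the position range the base model was trained on, which is why the scale factor equals the ratio of the two context sizes.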