Update README.md

README.md (CHANGED)

base_model: meta-llama/Meta-Llama-3-8B-Instruct

> [!TIP]
> You have to set the context with ***-c 32000*** in llama.cpp to take advantage of the full 32k context when you run it.

## How to run the model in interactive mode using llama.cpp with a long prompt inside a text file, passed with -f
```verilog
# clone and build llama.cpp, then run the model interactively with your long prompt file
git clone https://github.com/ggerganov/llama.cpp && cd llama.cpp && make -j

./main -m llama3ins-8b-32k-q4ns.gguf --temp 0.3 --color -f mylongprompt.txt -ngl 33 -n 2000 -i -c 32000
```
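
If you'd rather hit the model over HTTP than run it interactively, llama.cpp also ships a server example; a minimal sketch, assuming the same quant file as above (the port and the separate `make server` step are assumptions, your default build may already include it):

```verilog
# build the server target, then serve the model with the full 32k context
make server
./server -m llama3ins-8b-32k-q4ns.gguf -c 32000 -ngl 33 --port 8080
```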

## Prompt format - paste up to a 32000-token prompt inside the user{} brackets

> [!TIP] Put this inside your ***longprompt.txt*** file,
> or copy it from below and add it to the command above like this: -p "<|im_start....."

```xml
<|im_start|>system{You are a hyperintelligent hilarious raccoon that solves everything via first-principles based reasoning.}<|im_end|>
```
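
For reference, a complete prompt in this format might look like the sketch below. The user{} and assistant turns are an assumption extrapolated from the user{} bracket convention named in the heading above; the text inside user{} is a placeholder for your own long prompt:

```xml
<|im_start|>system{You are a hyperintelligent hilarious raccoon that solves everything via first-principles based reasoning.}<|im_end|>
<|im_start|>user{...paste your up-to-32000-token prompt here...}<|im_end|>
<|im_start|>assistant
```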

Final estimate: PPL = 22.7933 +/- 1.05192

> The ns quants are custom nisten quants and work well down to 2-bit.
> The 1.75-bit quant is included for reference; however, perplexity tanks and the output is incoherent.
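
The "Final estimate" line above is the format llama.cpp's perplexity tool prints. A sketch of how such a figure is produced; the corpus file and context size are assumptions, any raw text file works:

```verilog
# measure perplexity of the quant against a text corpus; lower is better
./perplexity -m llama3ins-8b-32k-q4ns.gguf -f wikitext-2-raw/wiki.test.raw -c 512
```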
# Built with Meta Llama 3