Update README.md (#6)
- Update README.md (528c40a25596a4ec57fc61768a2fe734e41088db)
README.md
CHANGED
@@ -13,7 +13,6 @@ tags:
- gguf
- llama cpp
---
# Octopus V4-GGUF: Graph of language models

@@ -32,22 +31,63 @@ tags:
**Acknowledgement**:
We sincerely thank our community members, [Mingyuan](https://huggingface.co/ThunderBeee) and [Zoey](https://huggingface.co/ZY6), for their extraordinary contributions to this quantization effort. Please explore [Octopus-v4](https://huggingface.co/NexaAIDev/Octopus-v4) for our original Hugging Face model.

-Note that `<nexa_4>` represents the math gpt.

## (Recommended) Run with [llama.cpp](https://github.com/ggerganov/llama.cpp)

1. **Clone and compile:**

```bash
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
# Compile the source code:
make
```

2. **Prepare the Input Prompt File:**

Navigate to the `prompts` folder inside the `llama.cpp` directory and create a new file named `chat-with-octopus.txt`.

`chat-with-octopus.txt`:

```bash
User:
```

3. **Execute the Model:**

Run the following command in the terminal:

```bash
./main -m ./path/to/octopus-v4-Q4_K_M.gguf -c 512 -b 2048 -n 256 -t 1 --repeat_penalty 1.0 --top_k 0 --top_p 1.0 --color -i -r "User:" -f prompts/chat-with-octopus.txt
```

Example prompt for interacting with the model:
```bash
<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>
```

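The prompt file from step 2 can also be created from the shell. This is a minimal sketch, assuming the current directory is the root of the `llama.cpp` checkout; it only writes the one-line file that the `-f prompts/chat-with-octopus.txt` argument reads.

```bash
#!/usr/bin/env bash
# Assumption: the current directory is the root of the llama.cpp checkout.
# Create the prompts folder if it is missing, then write the one-line prompt file.
mkdir -p prompts
printf 'User:\n' > prompts/chat-with-octopus.txt

# Print the file to confirm its content before starting ./main.
cat prompts/chat-with-octopus.txt
```
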
## Run with [Ollama](https://github.com/ollama/ollama)
1. Create a `Modelfile` in your directory. Include a `FROM` statement with the path to your local model, followed by the `PARAMETER` lines:
```bash
FROM ./path/to/octopus-v4-Q4_K_M.gguf
PARAMETER temperature 0
PARAMETER num_ctx 1024
PARAMETER stop <nexa_end>
```

2. Use the following command to add the model to Ollama:
```bash
ollama create octopus-v4-Q4_K_M -f Modelfile
```

3. Verify that the model has been successfully imported:
```bash
ollama ls
```

### Run the model
```bash
ollama run octopus-v4-Q4_K_M "<|system|>You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function.<|end|><|user|>Tell me the result of derivative of x^3 when x is 2?<|end|><|assistant|>"
```

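For repeated queries, the `ollama run` command can be wrapped in a small shell function. This is only a sketch, assuming the model was imported under the name `octopus-v4-Q4_K_M`; the function name `ask_octopus` and the `OLLAMA_BIN` override are hypothetical names introduced here for illustration and are not part of Ollama.

```bash
#!/usr/bin/env bash
# Hypothetical wrapper (not part of Ollama): send one query to the imported
# model, wrapped in the same router template as the documented command.
# OLLAMA_BIN is an assumed override so the binary can be swapped for a dry run.
ask_octopus() {
  local query="$1"
  local system="You are a router. Below is the query from the users, please call the correct function and generate the parameters to call the function."
  "${OLLAMA_BIN:-ollama}" run octopus-v4-Q4_K_M \
    "<|system|>${system}<|end|><|user|>${query}<|end|><|assistant|>"
}

# Usage:
#   ask_octopus "Tell me the result of derivative of x^3 when x is 2?"
# Dry run that only prints the arguments instead of calling Ollama:
#   OLLAMA_BIN=echo ask_octopus "Tell me the result of derivative of x^3 when x is 2?"
```

Keeping the template inside the function means only the user query changes between calls, which avoids retyping the special tokens by hand.
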
### Dataset and Benchmark