TheBloke committed on
Commit 9d77cac
1 Parent(s): 5371a97

Upload README.md

Files changed (1): README.md (+20 -2)
README.md CHANGED
@@ -10,6 +10,18 @@ model_creator: AIDC-ai-business
 model_name: Marcoroni 7b
 model_type: llama
 pipeline_tag: text-generation
+prompt_template: 'Below is an instruction that describes a task. Write a response
+  that appropriately completes the request.
+
+
+  ### Instruction:
+
+  {prompt}
+
+
+  ### Response:
+
+  '
 quantized_by: TheBloke
 ---
 
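For reference, a minimal Python sketch of how the `prompt_template` added in this hunk could be applied. It assumes plain `{prompt}` placeholder substitution; the `build_prompt` helper is illustrative only and not part of the commit:

```python
# Illustrative only: fill the Alpaca-style prompt_template added in this commit.
# Assumes the template is applied by plain {prompt} placeholder substitution.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n"
    "{prompt}\n\n"
    "### Response:\n"
)

def build_prompt(instruction: str) -> str:
    """Insert the user's instruction into the template's {prompt} slot."""
    return ALPACA_TEMPLATE.format(prompt=instruction)

print(build_prompt("Summarise the GGUF format in one sentence."))
```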
 
@@ -61,17 +73,23 @@ Here is an incomplete list of clients and libraries that are known to support GG
 <!-- repositories-available start -->
 ## Repositories available
 
+* [AWQ model(s) for GPU inference.](https://huggingface.co/TheBloke/Marcoroni-7b-AWQ)
 * [GPTQ models for GPU inference, with multiple quantisation parameter options.](https://huggingface.co/TheBloke/Marcoroni-7b-GPTQ)
 * [2, 3, 4, 5, 6 and 8-bit GGUF models for CPU+GPU inference](https://huggingface.co/TheBloke/Marcoroni-7b-GGUF)
 * [AIDC-ai-business's original unquantised fp16 model in pytorch format, for GPU inference and for further conversions](https://huggingface.co/AIDC-ai-business/Marcoroni-7b)
 <!-- repositories-available end -->
 
 <!-- prompt-template start -->
-## Prompt template: Unknown
+## Prompt template: Alpaca
 
 ```
+Below is an instruction that describes a task. Write a response that appropriately completes the request.
+
+### Instruction:
 {prompt}
 
+### Response:
+
 ```
 
 <!-- prompt-template end -->
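As an aside, one way to fetch a single GGUF file from the GGUF repository listed above is via `huggingface_hub`. This is a sketch: the filename here is copied from the CLI example later in this README and should be checked against the repository's file list:

```python
# Sketch: download one GGUF file from the GGUF repo listed above.
# The exact filename is an assumption taken from the CLI example below;
# verify it against the repository's "Files" tab.
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="TheBloke/Marcoroni-7b-GGUF",
    filename="marcoroni-7b.q4_K_M.gguf",
)
print(model_path)  # local cache path of the downloaded model
```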
@@ -193,7 +211,7 @@ Windows CLI users: Use `set HUGGINGFACE_HUB_ENABLE_HF_TRANSFER=1` before running
 Make sure you are using `llama.cpp` from commit [d0cee0d36d5be95a0d9088b674dbb27354107221](https://github.com/ggerganov/llama.cpp/commit/d0cee0d36d5be95a0d9088b674dbb27354107221) or later.
 
 ```shell
-./main -ngl 32 -m marcoroni-7b.q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "{prompt}"
+./main -ngl 32 -m marcoroni-7b.q4_K_M.gguf --color -c 4096 --temp 0.7 --repeat_penalty 1.1 -n -1 -p "Below is an instruction that describes a task. Write a response that appropriately completes the request.\n\n### Instruction:\n{prompt}\n\n### Response:"
 ```
 
 Change `-ngl 32` to the number of layers to offload to GPU. Remove it if you don't have GPU acceleration.
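For readers using llama-cpp-python rather than the raw CLI, a rough sketch of the equivalent call follows. The flag-to-parameter mapping (`-ngl` to `n_gpu_layers`, `-c` to `n_ctx`, `--temp` to `temperature`, `--repeat_penalty` to `repeat_penalty`) is an assumed equivalence, not part of this commit:

```python
# Rough llama-cpp-python equivalent of the ./main invocation above.
# Flag mapping is assumed: -ngl -> n_gpu_layers, -c -> n_ctx,
# --temp -> temperature, --repeat_penalty -> repeat_penalty.
from llama_cpp import Llama

llm = Llama(
    model_path="marcoroni-7b.q4_K_M.gguf",
    n_gpu_layers=32,  # drop to 0 if you have no GPU acceleration
    n_ctx=4096,
)

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nSummarise the GGUF format in one sentence.\n\n"
    "### Response:\n"
)

output = llm(prompt, max_tokens=256, temperature=0.7, repeat_penalty=1.1)
print(output["choices"][0]["text"])
```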
 