bartowski committed on
Commit
c924fc7
1 Parent(s): 3b44bbb

Update README.md

Files changed (1)
  1. README.md +5 -0
README.md CHANGED
@@ -23,6 +23,10 @@ base_model: deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
 **Original model**: [DeepSeek-Coder-V2-Lite-Instruct](https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct)<br>
 **GGUF quantization:** provided by [bartowski](https://huggingface.co/bartowski) based on `llama.cpp` release [b3166](https://github.com/ggerganov/llama.cpp/releases/tag/b3166)<br>
 
+## Model Settings:
+
+Flash attention MUST be **disabled** for this model to work.
+
 ## Model Summary:
 
 This is a brand new Mixture of Experts (MoE) model from DeepSeek, specializing in coding instructions.<br>
@@ -42,6 +46,7 @@ This will format the prompt as follows:
 User: {user_message}
 
 Assistant: {assistant_message}
+```
 
 ## Technical Details
 
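
The Model Settings section added in this commit requires flash attention to be off. Below is a minimal sketch of enforcing that when loading the GGUF from Python. It assumes the `llama-cpp-python` bindings and their `flash_attn` constructor keyword. The helper name `model_settings` and the file name are hypothetical and not part of the README.

```python
def model_settings(model_path: str, **overrides) -> dict:
    """Build loader keyword arguments with flash attention forced off."""
    settings = {"model_path": model_path, **overrides}
    # The README states flash attention MUST be disabled for this model,
    # so this overrides any caller-supplied value.
    settings["flash_attn"] = False
    return settings


if __name__ == "__main__":
    # Hypothetical local file name; substitute the quant you downloaded.
    kwargs = model_settings("DeepSeek-Coder-V2-Lite-Instruct-Q4_K_M.gguf", n_ctx=4096)
    print(kwargs)
    # Assumed usage with llama-cpp-python (not executed here):
    #   from llama_cpp import Llama
    #   llm = Llama(**kwargs)
```

Forcing the value in one helper, rather than relying on a default, keeps a stray `flash_attn=True` elsewhere in a configuration from silently re-enabling it.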