lucyknada committed on
Commit
45007c9
1 Parent(s): 5716b1c

Update README.md

Files changed (1): README.md +0 -13
README.md CHANGED
@@ -36,19 +36,6 @@ Can I ask a question?<|im_end|>
 """
 ```
 
-## Support
-
-To run inference on this model, you'll need to use Aphrodite, vLLM or EXL2/tabbyAPI, as llama.cpp hasn't yet merged the required pull request to fix the llama3.1 rope_freqs issue with custom head dimensions.
-
-However, you can work around this by quantizing the model yourself to create a functional GGUF file. Note that until [this PR](https://github.com/ggerganov/llama.cpp/pull/9141) is merged, the context will be limited to 8k tokens.
-
-To create a working GGUF file, make the following adjustments:
-
-1. Remove the `"rope_scaling": {}` entry from `config.json`
-2. Change `"max_position_embeddings"` to `8192` in `config.json`
-
-These modifications should allow you to use the model with llama.cpp, albeit with the mentioned context limitation.
-
 ## axolotl config
 
 <details><summary>See axolotl config</summary>
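
The two `config.json` edits described in the removed Support section can be sketched as a small script. This is a minimal sketch of the workaround, not part of the commit itself; the `patch_config` helper is hypothetical, and the stand-in config values are illustrative only.

```python
import json

def patch_config(cfg: dict) -> dict:
    """Apply the GGUF workaround from the removed Support section:
    drop the empty rope_scaling entry and cap context at 8192 tokens
    so llama.cpp can quantize the model (hypothetical helper)."""
    cfg.pop("rope_scaling", None)          # step 1: remove "rope_scaling": {}
    cfg["max_position_embeddings"] = 8192  # step 2: limit context to 8k
    return cfg

# Example on a minimal stand-in for the model's config.json:
cfg = patch_config({"rope_scaling": {}, "max_position_embeddings": 131072})
print(json.dumps(cfg, indent=2))
```

In practice you would load the model's real `config.json`, run it through the same two edits, and write it back before quantizing.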