shuttie committed on
Commit
d2a8978
1 Parent(s): e86d018

update model quantization

.gitattributes CHANGED
@@ -32,5 +32,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.xz filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
- *tfevents* filter=lfs diff=lfs merge=lfs -text
  *.gguf filter=lfs diff=lfs merge=lfs -text
README.md CHANGED
@@ -59,7 +59,7 @@ Fine-tuning took ~70 minutes on a single RTX 4090.
  This model can be run with a [llama-cpp](https://github.com/ggerganov/llama.cpp) on a CPU using the following command:
 
  ```
- ./main -n 64 -m models/ggml-model-q4.gguf -p "[INST] My girlfriend changed after she became a vegetarian. [/INST]"
+ ./main -n 64 -m models/ggml-model-q4_0.gguf -p "[INST] My girlfriend changed after she became a vegetarian. [/INST]"
 
  system_info: n_threads = 8 / 16 | AVX = 1 | AVX2 = 1 | AVX512 = 0 | AVX512_VBMI = 0 | AVX512_VNNI = 0 | FMA = 1 | NEON = 0 | ARM_FMA = 0 | F16C = 1 | FP16_VA = 0 | WASM_SIMD = 0 | BLAS = 0 | SSE3 = 1 | SSSE3 = 1 | VSX = 0 |
  sampling: repeat_last_n = 64, repeat_penalty = 1.100000, presence_penalty = 0.000000, frequency_penalty = 0.000000, top_k = 40, tfs_z = 1.000000, top_p = 0.950000, typical_p = 1.000000, temp = 0.800000, mirostat = 0, mirostat_lr = 0.100000, mirostat_ent = 5.000000
ggml-model-q4.gguf → ggml-model-q4_0.gguf RENAMED
File without changes
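
Because the model file was renamed without content changes, scripts that still point at `ggml-model-q4.gguf` will stop finding it. The following is a minimal Python sketch of one way a caller might cope with that, assuming a local `models/` directory as in the README command. The helpers `resolve_model` and `build_command` are hypothetical names introduced for illustration and are not part of this repository or of llama.cpp; only the two file names and the `./main -n … -m … -p …` arguments come from the diff above.

```python
from pathlib import Path

# Both file names are taken from the rename in this commit.
# The helper functions themselves are hypothetical.
NEW_NAME = "ggml-model-q4_0.gguf"
OLD_NAME = "ggml-model-q4.gguf"


def resolve_model(models_dir="models"):
    """Return the model path, preferring the new name over the pre-rename one."""
    base = Path(models_dir)
    for name in (NEW_NAME, OLD_NAME):
        candidate = base / name
        if candidate.is_file():
            return candidate
    raise FileNotFoundError(f"neither {NEW_NAME} nor {OLD_NAME} found in {base}")


def build_command(model_path, prompt, n_predict=64):
    """Build the argument list for the ./main invocation shown in the README."""
    return ["./main", "-n", str(n_predict), "-m", str(model_path), "-p", prompt]
```

The resulting list could be passed to `subprocess.run` from a llama.cpp checkout that contains the `main` binary. Preferring the new name first means a directory holding both files resolves to the post-rename one.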