This model was converted to GGUF format from [`ToastyPigeon/Ruby-Music-8B`](https://huggingface.co/ToastyPigeon/Ruby-Music-8B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/ToastyPigeon/Ruby-Music-8B) for more details on the model.
---
Note that this model is based on InternLM3, not LLaMA 3.

A roleplaying/creative-writing fine-tune of [internlm/internlm3-8b-instruct](https://huggingface.co/internlm/internlm3-8b-instruct), provided as an alternative to L3 8B for folks with 8GB VRAM.

This was trained on a mix of private instruct (~1k samples) and roleplaying (~2.5k human and ~1k synthetic samples) data, along with the following public datasets:

- allenai/tulu-3-sft-personas-instruction-following (~500 samples)
- PocketDoc/Dans-Prosemaxx-Gutenberg (all samples)
- ToastyPigeon/SpringDragon-Instruct (~500 samples)
- allura-org/fujin-cleaned-stage-2 (~500 samples)

The instruct format is standard ChatML:

```
<|im_start|>system
{system prompt}<|im_end|>
<|im_start|>user
{user message}<|im_end|>
<|im_start|>assistant
{assistant response}<|im_end|>
```
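Concretely, a ChatML prompt can be assembled with a small helper like the sketch below (the `{"role": ..., "content": ...}` message shape is an assumption for illustration, not part of the model card):

```python
def to_chatml(messages):
    """Render a list of {"role", "content"} dicts as a ChatML prompt,
    leaving an open assistant turn for the model to complete."""
    prompt = ""
    for m in messages:
        prompt += f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>\n"
    # Open the assistant turn; generation should stop at <|im_end|>.
    return prompt + "<|im_start|>assistant\n"

example = to_chatml([
    {"role": "system", "content": "You are a creative co-writer."},
    {"role": "user", "content": "Start a scene in a rainy harbor town."},
])
print(example)
```

Most llama.cpp frontends apply this template automatically when the GGUF metadata declares ChatML, so a manual builder like this is mainly useful for raw `/completion`-style calls.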

Recommended sampler settings:

- temp 1
- smoothing factor 0.5, smoothing curve 1
- DRY 0.5/1.75/5/1024
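Reading the DRY shorthand as multiplier/base/allowed-length/penalty-range, a backend that exposes these samplers over its HTTP API could take a request body along these lines (the field names below follow koboldcpp-style conventions and are an assumption — check your backend's API documentation; plain llama.cpp does not expose a smoothing factor):

```json
{
  "temperature": 1.0,
  "smoothing_factor": 0.5,
  "smoothing_curve": 1.0,
  "dry_multiplier": 0.5,
  "dry_base": 1.75,
  "dry_allowed_length": 5,
  "dry_penalty_last_n": 1024
}
```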
61
+
62
+
63
+ There may be better sampler settings, but this at least has proven
64
+ stable in my testing. InternLM3 requires a high amount of tail filtering
65
+ (high min-p, top-a, or something similar) to avoid making strange typos
66
+ and spelling mistakes. Note: this might be a current issue with llama.cpp and the GGUF versions I tested.
67
+
68
+
69
+
70
+
71
+
72
+
73
+
74
+ Notes:
75
+
76
+
77
+
78
+
79
+ I noticed this model has trouble outputting the EOS token sometimes (despite confirming that <|im_end|>
80
+ appears at the end of every turn in the training data). This can cause
81
+ it to ramble at the end of a message instead of ending its turn.
82
+
83
+
84
+ You can either cut the end out of the messages until it picks up the
85
+ right response length, or use logit bias. I've had success getting
86
+ right-sized turns setting logit bias for <|im_end|> to 2.
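As a sketch of the logit-bias workaround against llama.cpp's `llama-server` `/completion` endpoint: the helper below builds the request body, since `logit_bias` there takes `[token_id, bias]` pairs. The token id of `<|im_end|>` depends on the tokenizer, so look it up once (e.g. via the server's `/tokenize` endpoint) rather than hard-coding it — the id used in the demo call is a placeholder.

```python
import json

def completion_payload(prompt, im_end_id, bias=2.0):
    """Build a llama-server /completion request body that nudges the
    model toward emitting <|im_end|> (positive bias = more likely)."""
    return {
        "prompt": prompt,
        "temperature": 1.0,
        # logit_bias is a list of [token_id, bias] pairs.
        "logit_bias": [[im_end_id, bias]],
    }

# Placeholder token id; fetch the real one from /tokenize for your GGUF.
payload = completion_payload(
    "<|im_start|>user\nhi<|im_end|>\n<|im_start|>assistant\n",
    im_end_id=2,
)
print(json.dumps(payload))
```

A bias of 2 is a gentle nudge; if the model still rambles, raise it slightly rather than jumping to large values, which can truncate turns too early.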

---
## Use with llama.cpp
Install llama.cpp through brew (works on Mac and Linux)
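Assuming the standard Homebrew formula name:

```shell
brew install llama.cpp
```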