This model was converted to GGUF format from [`PygmalionAI/Pygmalion-3-12B`](https://huggingface.co/PygmalionAI/Pygmalion-3-12B) using llama.cpp via ggml.ai's [GGUF-my-repo](https://huggingface.co/spaces/ggml-org/gguf-my-repo) space.
  Refer to the [original model card](https://huggingface.co/PygmalionAI/Pygmalion-3-12B) for more details on the model.
---

## Dataset

We've gathered a large collection of instruction and roleplay data totaling hundreds of millions of tokens, including our PIPPA dataset and data from roleplaying forums.
## Limitations and biases

The intended use-case for this model is fictional writing for entertainment purposes. Any other sort of usage is out of scope.

As such, it was not fine-tuned to be safe and harmless: the base model and this fine-tune have been trained on data known to contain profanity and texts that are lewd or otherwise offensive. It may produce socially unacceptable or undesirable text, even if the prompt itself does not include anything explicitly offensive. Outputs might often be factually wrong or misleading.
## Training Specifications

We trained our model as a rank-32 LoRA adapter for one epoch over our data on 8x NVIDIA A40 GPUs. For this run, we used a learning rate of 2e-4 and a total batch size of 24 across all GPUs. A cosine learning rate scheduler with a 100-step warmup was used, and DeepSpeed ZeRO was used to keep memory usage down.
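As an illustrative sketch only (not the actual trainer code, which was handled by the training framework), the warmup-plus-cosine schedule described above — linear ramp over the first 100 steps, then a cosine decay from the 2e-4 peak — can be written as:

```python
import math

def lr_at(step, total_steps, base_lr=2e-4, warmup_steps=100):
    """Learning rate at a given step: linear warmup, then cosine decay to 0."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr over the warmup period.
        return base_lr * step / warmup_steps
    # Fraction of the post-warmup schedule completed, in [0, 1].
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    # Cosine decay: base_lr at progress=0, 0 at progress=1.
    return 0.5 * base_lr * (1.0 + math.cos(math.pi * progress))
```

For example, `lr_at(100, 1000)` returns the peak rate 2e-4, and the rate decays smoothly toward 0 as training approaches the final step.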
## Acknowledgements

This project could not have been done without the compute support of Hive Digital Technologies and the Axolotl training software.

We'd like to extensively thank lemonilia for their wonderful help in compiling roleplay forum data.

And most of all, we dedicate this model to our great community, who've stuck with us through everything until now. Sincerely, thank you so much. We hope you enjoy our work to the fullest, and we promise more is on the way soon.

---
  ## Use with llama.cpp
  Install llama.cpp through brew (works on Mac and Linux)
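Assuming Homebrew is already set up, the install mentioned above is a single command:

```shell
brew install llama.cpp
```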