Commit 9038f21 by jono1234
2 parents: 644cd72, 8c20436

Merge branch 'main' of https://huggingface.co/jono1234/RWKV-Decepticon into main

Files changed (2)
  1. LICENSE +5 -1
  2. README.md +66 -0
LICENSE CHANGED
@@ -33,4 +33,8 @@ RWKV-Decepticon Large Language Model License
 
  7. Disclaimer
 
- The Model is provided “as is”, without warranty of any kind, express or implied.
+ The Model is provided “as is”, without warranty of any kind, express or implied.
+
+ 8. Content Restrictions
+
+ The Model shall not be used to generate, promote, or facilitate content related to child pornography (CP) or any other form of child exploitation. Using the Model to attempt to generate CP or engage in any activity related to child exploitation is strictly prohibited. Refusal to comply with these restrictions is a violation of this license and may result in termination of the license.
README.md CHANGED
@@ -16,8 +16,74 @@ The name "Decepticon" stems from the model's unique architecture, which combines
 
  The model features a context length of 1024, but in theory, it can be extended indefinitely through fine-tuning.
 
+ ```
+ rwkv-decepticon-char-20m.pth is to be used with vocab.json. This is a character level model.
+ n_layer: 6
+ n_embd: 512
+ ctx_len: 1024
 
+ rwkv-decepticon-books-140m.pth (in progress) - trained on 20gb of books.
+ n_layer: 8
+ n_embd: 768
+ ctx_len: 1024
 
+ rwkv-decepticon-70m.pth (coming soon) is to be used with 20B_tokenizer.json.
+ n_layer: 8
+ n_embd: 768
+ ctx_len: 1024
+
+ rwkv-decepticon-170m.pth (coming soon) is trained on a small subset of the SlimPajama dataset (6gb). This also uses the 20B_tokenizer.json file.
+ n_layer: 8
+ n_embd: 768
+ ctx_len: 1024
+ ```
+
+ I would like to train a 7B parameter model but lack the compute required. If you would like to sponsor some compute, please contact me.
+
+
+ # License
+ ```
+ RWKV-Decepticon Large Language Model License
+
+ 1. Definitions
+
+ “Model” refers to the RWKV-Decepticon Large Language Model.
+ “Individual User” refers to a single person using the Model for personal use.
+ “Company” refers to any for-profit organization or entity using the Model for commercial purposes.
+ “Non-Profit” refers to any non-profit organization that is not operating under a Company.
+ “Research” refers to activities undertaken with the goal of discovering new knowledge or insights, testing theories or ideas, or analyzing data or existing information. It does not include activities where the primary purpose is commercial gain or profit.
+ “Open Source” refers to something that is publicly accessible and can be used, modified, and shared by anyone. In the context of this license, it means that any modifications made to the Model must be publicly released and free for others to use, modify, and distribute.
+
+ 2. Grant of License
+
+ This license grants Individual Users, Non-Profits, and Research entities the right to use, modify, and distribute the Model in any way they see fit.
+ Companies are granted the right to use, modify, and distribute the Model, but not for profit.
+
+ 3. Conditions
+
+ Any modifications made to the Model by Companies must be open source and distributed under this same license within one week of modification.
+ Individual Users and Non-Profits not operating under a Company are not subject to this condition.
+
+ 4. Profit Use
+
+ Profit use by Companies includes any direct or indirect commercial use that could bring in money for the Company or aid them in bringing in money.
+
+ 5. Research
+
+ Research entities are allowed to use the Model for research purposes only and not for profit.
+
+ 6. Violations and Termination
+
+ Violation of these terms will result in immediate termination of this license.
+
+ 7. Disclaimer
+
+ The Model is provided “as is”, without warranty of any kind, express or implied.
+
+ 8. Content Restrictions
+
+ The Model shall not be used to generate, promote, or facilitate content related to child pornography (CP) or any other form of child exploitation. Using the Model to attempt to generate CP or engage in any activity related to child exploitation is strictly prohibited. Refusal to comply with these restrictions is a violation of this license and may result in termination of the license.
+ ```
 
 
  Thank you to the creators of RWKV who made all of this possible. Their repo is here: https://github.com/BlinkDL/RWKV-LM
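
Reading aid for the checkpoint list added to README.md: the hyperparameters can be gathered into one lookup table. This is a hypothetical sketch, not code from the repository. `CheckpointConfig`, `CHECKPOINTS`, and `tokenizer_for` are illustrative names, and only the values come from the diff. The README names no tokenizer for the books checkpoint, so that field is left empty rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CheckpointConfig:
    """Hyperparameters for one checkpoint, copied from the README diff."""
    filename: str
    n_layer: int
    n_embd: int
    ctx_len: int
    tokenizer_file: Optional[str]  # None where the README names no tokenizer


# Keys are illustrative short names (an assumption); values are from the README.
CHECKPOINTS = {
    "char-20m": CheckpointConfig("rwkv-decepticon-char-20m.pth", 6, 512, 1024, "vocab.json"),
    "books-140m": CheckpointConfig("rwkv-decepticon-books-140m.pth", 8, 768, 1024, None),
    "70m": CheckpointConfig("rwkv-decepticon-70m.pth", 8, 768, 1024, "20B_tokenizer.json"),
    "170m": CheckpointConfig("rwkv-decepticon-170m.pth", 8, 768, 1024, "20B_tokenizer.json"),
}


def tokenizer_for(name: str) -> str:
    """Return the tokenizer file the README pairs with a checkpoint."""
    cfg = CHECKPOINTS[name]
    if cfg.tokenizer_file is None:
        raise ValueError(f"README does not name a tokenizer for {cfg.filename}")
    return cfg.tokenizer_file


if __name__ == "__main__":
    for name, cfg in CHECKPOINTS.items():
        print(name, cfg.n_layer, cfg.n_embd, cfg.ctx_len, cfg.tokenizer_file)
```

A table like this makes it harder to pair a checkpoint with the wrong vocabulary file, for example loading the character-level model with `20B_tokenizer.json`.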