Update README.md

README.md CHANGED
@@ -19,8 +19,11 @@ The model features a context length of 1024, but in theory, it can be extended i
 ```
 rwkv-decepticon-char-20m.pth is to be used with vocab.json. This is a character level model.
 rwkv-decepticon-70m.pth (coming soon) is to be used with 20B_tokenizer.json.
+rwkv-decepticon-170m.pth (coming soon) is trained on a small subset of the SlimPajama dataset (6gb). This also uses the 20B_tokenizer.json file.
 ```
 
+I would like to train a 7B parameter model but lack the compute required. If you would like to sponsor some compute, please contact me.
+
 
 
 Thank you to the creators of RWKV who made all of this possible. Their repo is here: https://github.com/BlinkDL/RWKV-LM
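The "character level model" note in the diff means one token per character rather than subword tokens. The repo's actual vocab.json is not reproduced here, so the sketch below is only an assumption about how such a character-level vocabulary generally works, not the project's real mapping:

```python
# Hedged sketch: a character-level vocabulary in the spirit of vocab.json,
# as used by a char-level model such as rwkv-decepticon-char-20m.pth.
# The real repo's character-to-id mapping may differ.

def build_vocab(corpus: str) -> dict:
    """Map each unique character in the corpus to an integer id."""
    return {ch: i for i, ch in enumerate(sorted(set(corpus)))}

def encode(text: str, vocab: dict) -> list:
    """Turn a string into a list of token ids, one per character."""
    return [vocab[ch] for ch in text]

def decode(ids: list, vocab: dict) -> str:
    """Invert the vocabulary to recover the original string."""
    inv = {i: ch for ch, i in vocab.items()}
    return "".join(inv[i] for i in ids)

vocab = build_vocab("hello world")
ids = encode("hello", vocab)   # → [3, 2, 4, 4, 5]
assert decode(ids, vocab) == "hello"
```

The 70m and 170m checkpoints instead pair with 20B_tokenizer.json, a subword tokenizer, so their token ids cover multi-character units rather than single characters.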