Update README.md

README.md CHANGED
@@ -19,8 +19,11 @@ The model features a context length of 1024, but in theory, it can be extended i
 ```
 rwkv-decepticon-char-20m.pth is to be used with vocab.json. This is a character level model.
 rwkv-decepticon-70m.pth (coming soon) is to be used with 20B_tokenizer.json.
+rwkv-decepticon-170m.pth (coming soon) is trained on a small subset of the SlimPajama dataset (6gb). This also uses the 20B_tokenizer.json file.
 ```
 
+I would like to train a 7B parameter model but lack the compute required. If you would like to sponsor some compute, please contact me.
+
 
 
 Thank you to the creators of RWKV who made all of this possible. Their repo is here: https://github.com/BlinkDL/RWKV-LM
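The "character level model" note in the diff means one token per character rather than subword tokens. The repo's actual vocab.json is not reproduced here, so the sketch below is only an assumption about how such a character-level vocabulary generally works, not the project's real mapping:

```python
# Hedged sketch: a character-level vocabulary in the spirit of vocab.json,
# as used by a char-level model such as rwkv-decepticon-char-20m.pth.
# The real repo's character-to-id mapping may differ.

def build_vocab(corpus: str) -> dict:
    """Map each unique character in the corpus to an integer id."""
    return {ch: i for i, ch in enumerate(sorted(set(corpus)))}

def encode(text: str, vocab: dict) -> list:
    """Turn a string into a list of token ids, one per character."""
    return [vocab[ch] for ch in text]

def decode(ids: list, vocab: dict) -> str:
    """Invert the vocabulary to recover the original string."""
    inv = {i: ch for ch, i in vocab.items()}
    return "".join(inv[i] for i in ids)

vocab = build_vocab("hello world")
ids = encode("hello", vocab)   # → [3, 2, 4, 4, 5]
assert decode(ids, vocab) == "hello"
```

The 70m and 170m checkpoints instead pair with 20B_tokenizer.json, a subword tokenizer, so their token ids cover multi-character units rather than single characters.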