Update README.md
README.md (changed)

@@ -18,14 +18,15 @@ The model features a context length of 1024, but in theory, it can be extended i

```
rwkv-decepticon-char-20m.pth is to be used with vocab.json. This is a character-level model.

n_layer: 6
n_embd: 512
ctx_len: 1024

rwkv-decepticon-70m.pth (coming soon) is to be used with 20B_tokenizer.json.
n_layer: 8
n_embd: 768
ctx_len: 1024

rwkv-decepticon-170m.pth (coming soon) is trained on a small subset of the SlimPajama dataset (6 GB). This also uses the 20B_tokenizer.json file.
n_layer: 8
n_embd: 768
```
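The per-checkpoint settings above can be summarized in a small lookup table. The sketch below is illustrative only: the `MODEL_CONFIGS` dictionary and `config_for` helper are names invented for this example, not an API provided by the repository, and the 170m entry omits `ctx_len` because the README does not list one for it.

```python
# Illustrative summary of the checkpoint configurations listed above.
# MODEL_CONFIGS and config_for are assumptions for this sketch, not
# part of the repository's code.
MODEL_CONFIGS = {
    "rwkv-decepticon-char-20m.pth": {
        "tokenizer": "vocab.json",  # character-level vocabulary
        "n_layer": 6,
        "n_embd": 512,
        "ctx_len": 1024,
    },
    "rwkv-decepticon-70m.pth": {
        "tokenizer": "20B_tokenizer.json",
        "n_layer": 8,
        "n_embd": 768,
        "ctx_len": 1024,
    },
    "rwkv-decepticon-170m.pth": {
        "tokenizer": "20B_tokenizer.json",
        "n_layer": 8,
        "n_embd": 768,
        # ctx_len not stated in the README for this checkpoint
    },
}

def config_for(checkpoint: str) -> dict:
    """Return the settings recorded for a checkpoint file name."""
    return MODEL_CONFIGS[checkpoint]

print(config_for("rwkv-decepticon-char-20m.pth")["n_embd"])  # 512
```

Pairing each checkpoint with its tokenizer file this way makes it harder to accidentally load the character-level model with the 20B tokenizer, or vice versa.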