Update README.md
README.md (changed)

@@ -18,14 +18,15 @@ The model features a context length of 1024, but in theory, it can be extended i

```
rwkv-decepticon-char-20m.pth is to be used with vocab.json. This is a character-level model.

n_layer: 6
n_embd: 512
ctx_len: 1024

rwkv-decepticon-70m.pth (coming soon) is to be used with 20B_tokenizer.json.
n_layer: 8
n_embd: 768
ctx_len: 1024

rwkv-decepticon-170m.pth (coming soon) is trained on a small subset of the SlimPajama dataset (6 GB). This also uses the 20B_tokenizer.json file.
n_layer: 8
n_embd: 768
```
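The per-checkpoint settings above can be summarized in a small lookup table. The sketch below is illustrative only: the `MODEL_CONFIGS` dictionary and `config_for` helper are names invented for this example, not an API provided by the repository, and the 170m entry omits `ctx_len` because the README does not list one for it.

```python
# Illustrative summary of the checkpoint configurations listed above.
# MODEL_CONFIGS and config_for are assumptions for this sketch, not
# part of the repository's code.
MODEL_CONFIGS = {
    "rwkv-decepticon-char-20m.pth": {
        "tokenizer": "vocab.json",  # character-level vocabulary
        "n_layer": 6,
        "n_embd": 512,
        "ctx_len": 1024,
    },
    "rwkv-decepticon-70m.pth": {
        "tokenizer": "20B_tokenizer.json",
        "n_layer": 8,
        "n_embd": 768,
        "ctx_len": 1024,
    },
    "rwkv-decepticon-170m.pth": {
        "tokenizer": "20B_tokenizer.json",
        "n_layer": 8,
        "n_embd": 768,
        # ctx_len not stated in the README for this checkpoint
    },
}

def config_for(checkpoint: str) -> dict:
    """Return the settings recorded for a checkpoint file name."""
    return MODEL_CONFIGS[checkpoint]

print(config_for("rwkv-decepticon-char-20m.pth")["n_embd"])  # 512
```

Pairing each checkpoint with its tokenizer file this way makes it harder to accidentally load the character-level model with the 20B tokenizer, or vice versa.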