Update README.md
README.md
CHANGED
@@ -7,4 +7,18 @@ language:
 library_name: transformers
 ---
 
-GPT-NeoX trained on MiniPile, for a baseline to compare my MANN models against. Uses [NeelNanda/gpt-neox-tokenizer-digits](https://huggingface.co/NeelNanda/gpt-neox-tokenizer-digits) for tokenization.
+GPT-NeoX trained on MiniPile, for a baseline to compare my MANN models against. Uses [NeelNanda/gpt-neox-tokenizer-digits](https://huggingface.co/NeelNanda/gpt-neox-tokenizer-digits) for tokenization.
+
+The exact model configuration is as follows:
+```
+cfg = GPTNeoXConfig(
+    vocab_size = len(tokenizer),
+    hidden_size = 768,
+    intermediate_size = 768*4,
+    num_hidden_layers = 12,
+    num_attention_heads = 12,
+    tie_word_embeddings = True,
+    hidden_act = "gelu_new",
+    tokenizer = "NeelNanda/gpt-neox-tokenizer-digits"
+)
+```
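As a rough sanity check on the configuration added in this change, the parameter count can be estimated by hand. The sketch below is not part of the README. It assumes the usual GPT-NeoX block layout (a fused QKV projection, an output projection, a two-layer MLP, two LayerNorms per block, biases on every linear layer, rotary positions with no learned position table) and that `tie_word_embeddings = True` shares one matrix between the input and output embeddings. The helper names and the vocabulary size of 50,000 are placeholders; the real value is `len(tokenizer)`.

```python
"""Back-of-the-envelope parameter estimate for the GPT-NeoX config above.

Assumptions (not confirmed by the README): standard GPT-NeoX block layout,
biases on all linear layers, no learned position embeddings.
"""


def per_layer_params(hidden_size: int, intermediate_size: int) -> int:
    """Weights and biases in one transformer block."""
    h, i = hidden_size, intermediate_size
    # Fused query/key/value projection plus the attention output projection.
    attention = (h * 3 * h + 3 * h) + (h * h + h)
    # Up-projection to the intermediate size, then down-projection.
    mlp = (h * i + i) + (i * h + h)
    # Two LayerNorms per block, each with a weight and a bias vector.
    layer_norms = 2 * (2 * h)
    return attention + mlp + layer_norms


def estimate_params(
    vocab_size: int,
    hidden_size: int = 768,
    intermediate_size: int = 768 * 4,
    num_hidden_layers: int = 12,
    tie_word_embeddings: bool = True,
) -> dict:
    """Split the estimate into embedding and non-embedding parameters."""
    blocks = num_hidden_layers * per_layer_params(hidden_size, intermediate_size)
    final_layer_norm = 2 * hidden_size
    # One vocab x hidden matrix if tied, two if input and output are separate.
    embedding = vocab_size * hidden_size * (1 if tie_word_embeddings else 2)
    non_embedding = blocks + final_layer_norm
    return {
        "embedding": embedding,
        "non_embedding": non_embedding,
        "total": embedding + non_embedding,
    }


if __name__ == "__main__":
    # Placeholder vocabulary size; substitute len(tokenizer) for the real one.
    counts = estimate_params(vocab_size=50_000)
    for name, value in counts.items():
        print(f"{name}: {value:,}")
```

Under these assumptions the non-embedding part comes to about 85M parameters regardless of vocabulary size, and each of the 12 attention heads has 768 / 12 = 64 dimensions.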