teddy-f-47 committed
Commit 5da5d03 • 1 Parent(s): 6032eb5
Update README.md
README.md CHANGED
@@ -17,7 +17,16 @@ This model is based on [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-
 
 ## Model description
 
-The model was trained for a context length of 1024 tokens.
+The model was trained for a context length of 1024 tokens. In addition, while the original model has a hidden size of 2048 (1.3B parameters), this model has a hidden size of 1024 (450.3M parameters).
+
+The model used for training was as follows:
+```
+model_config = AutoConfig.from_pretrained(
+    'microsoft/phi-1_5', vocab_size=len(trained_tokenizer), max_position_embeddings=1024,
+    hidden_size=1024, attn_implementation="flash_attention_2", trust_remote_code=True
+)
+model = AutoModelForCausalLM.from_config(model_config, trust_remote_code=True)
+```
 
 ## Intended uses & limitations
 
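The added README text ties the smaller parameter count to the reduced hidden size. A rough sketch of that arithmetic follows. It is not taken from the commit: the function name `estimate_decoder_params`, the layer count of 24, the vocabulary size of 51200, and the rule that the MLP inner size defaults to four times the hidden size are all assumptions about a Phi-style decoder, labeled as such in the code.

```python
def estimate_decoder_params(hidden_size, num_layers, vocab_size, inner_size=None):
    """Rough parameter count for a Phi-style decoder-only transformer.

    Assumes (not stated in the README): biased linear layers, one LayerNorm
    per block, a final LayerNorm, and an untied output head with bias.
    """
    # Assumption: the MLP inner size defaults to 4x the hidden size.
    inner = inner_size if inner_size is not None else 4 * hidden_size
    # Query, key, value and output projections, each with a bias vector.
    attention = 4 * (hidden_size * hidden_size + hidden_size)
    # Two MLP projections (up and down), each with a bias vector.
    mlp = (hidden_size * inner + inner) + (inner * hidden_size + hidden_size)
    # LayerNorm weight and bias for the block.
    block_norm = 2 * hidden_size
    per_layer = attention + mlp + block_norm

    embedding = vocab_size * hidden_size
    final_norm = 2 * hidden_size
    lm_head = vocab_size * hidden_size + vocab_size
    return embedding + num_layers * per_layer + final_norm + lm_head


# Hypothetical values for illustration; the commit only states hidden sizes.
ASSUMED_LAYERS = 24
ASSUMED_VOCAB = 51200

full = estimate_decoder_params(2048, ASSUMED_LAYERS, ASSUMED_VOCAB)
half = estimate_decoder_params(1024, ASSUMED_LAYERS, ASSUMED_VOCAB)
print(f"hidden_size=2048: {full / 1e6:.1f}M parameters")
print(f"hidden_size=1024: {half / 1e6:.1f}M parameters")
```

The per-block terms scale with the square of the hidden size, while the embedding and output head scale linearly with it, so halving the hidden size shrinks the model by more than half but less than a factor of four. The estimate is not expected to reproduce the quoted 1.3B or 450.3M figures exactly, since the real count depends on `len(trained_tokenizer)`, which the README does not state.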