teddy-f-47/phi-pl-400M-v_0_1


teddy-f-47 committed on Jan 8

Commit 5da5d03 • 1 Parent(s): 6032eb5

Update README.md

Files changed (1): README.md (+10 -1)

@@ -17,7 +17,16 @@ This model is based on [microsoft/phi-1_5](https://huggingface.co/microsoft/phi-
 ## Model description
-The model was trained for a context length of 1024 tokens.
+The model was trained for a context length of 1024 tokens. In addition, while the original model has a hidden size of 2048 (1.3B parameters), this model has a hidden size of 1024 (450.3M parameters).
+The model was instantiated for training as follows:
+```
+model_config = AutoConfig.from_pretrained(
+    'microsoft/phi-1_5', vocab_size=len(trained_tokenizer), max_position_embeddings=1024,
+    hidden_size=1024, attn_implementation="flash_attention_2", trust_remote_code=True
+)
+model = AutoModelForCausalLM.from_config(model_config, trust_remote_code=True)
+```
 ## Intended uses & limitations
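
The README change above ties the smaller parameter count to the reduced hidden size. As a rough illustration of why halving the hidden size shrinks the transformer blocks by about a factor of four, here is a minimal sketch of a parameter estimate. It is not the author's code. The block layout (fused QKV projection with biases, a 4x MLP, one LayerNorm per block, untied input and output embeddings), the layer count of 24, and the vocabulary size of 51200 are all assumptions made for illustration. The vocabulary size of the retrained tokenizer is not given in the source, so it is left as a parameter, and the totals produced here are estimates that need not match the rounded figures quoted in the README.

```python
def estimate_decoder_params(hidden_size, num_layers, vocab_size, mlp_ratio=4):
    """Rough parameter estimate for a decoder-only transformer.

    Assumed layout (hypothetical, for illustration only):
    - fused QKV projection and output projection, both with biases
    - two-layer MLP with inner width mlp_ratio * hidden_size, with biases
    - one LayerNorm (weight + bias) per block
    - untied token embedding and output head, final LayerNorm, head bias
    """
    h = hidden_size
    inner = mlp_ratio * h

    attention = (h * 3 * h + 3 * h) + (h * h + h)  # QKV + output projection
    mlp = (h * inner + inner) + (inner * h + h)    # up + down projection
    layer_norm = 2 * h                             # weight + bias
    per_layer = attention + mlp + layer_norm

    blocks = per_layer * num_layers
    token_embedding = vocab_size * h
    output_head = 2 * h + vocab_size * h + vocab_size  # final norm + linear

    return {
        "per_layer": per_layer,
        "blocks": blocks,
        "embeddings_and_head": token_embedding + output_head,
        "total": blocks + token_embedding + output_head,
    }


# Assumed values: 24 layers and a 51200-token vocabulary for both widths.
wide = estimate_decoder_params(hidden_size=2048, num_layers=24, vocab_size=51200)
narrow = estimate_decoder_params(hidden_size=1024, num_layers=24, vocab_size=51200)

print("block parameters, hidden_size=2048:", wide["blocks"])
print("block parameters, hidden_size=1024:", narrow["blocks"])
print("block ratio:", wide["blocks"] / narrow["blocks"])
```

The block weights scale with the square of the hidden size, while the embedding and output head scale only linearly with it (and with the vocabulary size). With a smaller hidden size, the vocabulary-dependent terms therefore make up a larger share of the total.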