wwwaj committed
Commit ca5f034
1 Parent(s): f33e280

fix multiple typos in README

Files changed (1)
  1. README.md +4 -7
README.md CHANGED
@@ -104,9 +104,7 @@ tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-128k-instruct")
 messages = [
     {"role": "system", "content": "You are a helpful digital assistant. Please provide safe, ethical and accurate information to the user."},
     {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
-    {"role": "system", "content": "Sure! Here are some ways to eat bananas and dragonfruits together:"},
-    {"role": "system", "content": "1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey."},
-    {"role": "system", "content": "2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
+    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
     {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
 ]

@@ -130,8 +128,7 @@ print(output[0]['generated_text'])
 Note that by default the model use flash attention which requires certain types of GPU to run. If you want to run the model on:

 + V100 or earlier generation GPUs: call `AutoModelForCausalLM.from_pretrained()` with `attn_implementation="eager"`
- + CPU: use the **GGUF** quantized models [4K](https://aka.ms/Phi3-mini-4k-instruct-gguf)
- + Optimized inference: use the **ONNX** models [4K](https://aka.ms/Phi3-mini-128k-instruct-onnx)
+ + Optimized inference: use the **ONNX** models [128K](https://aka.ms/phi3-mini-128k-instruct-onnx)

 ## Responsible AI Considerations

@@ -157,7 +154,7 @@ Developers should apply responsible AI best practices and are responsible for en

 * Architecture: Phi-3 Mini-128K-Instruct has 3.8B parameters and is a dense decoder-only Transformer model. The model is fine-tuned with Supervised fine-tuning (SFT) and Direct Preference Optimization (DPO) to ensure alignment with human preferences and safety guidlines.
 * Inputs: Text. It is best suited for prompts using chat format.
- * Context length: 4K tokens
+ * Context length: 128K tokens
 * GPUs: 512 H100-80G
 * Training time: 7 days
 * Training data: 3.3T tokens

@@ -188,7 +185,7 @@ More specifically, we do not change prompts, pick different few-shot examples, c

 The number of k–shot examples is listed per-benchmark.

- | | Phi-3-Mini-4K-In<br>3.8b | Phi-3-Small<br>7b (preview) | Phi-3-Medium<br>14b (preview) | Phi-2<br>2.7b | Mistral<br>7b | Gemma<br>7b | Llama-3-In<br>8b | Mixtral<br>8x7b | GPT-3.5<br>version 1106 |
+ | | Phi-3-Mini-128K-In<br>3.8b | Phi-3-Small<br>7b (preview) | Phi-3-Medium<br>14b (preview) | Phi-2<br>2.7b | Mistral<br>7b | Gemma<br>7b | Llama-3-In<br>8b | Mixtral<br>8x7b | GPT-3.5<br>version 1106 |
 |---|---|---|---|---|---|---|---|---|---|
 | MMLU <br>5-Shot | 68.8 | 75.3 | 78.2 | 56.3 | 61.7 | 63.6 | 66.0 | 68.4 | 71.4 |
 | HellaSwag <br> 5-Shot | 76.7 | 78.7 | 83.2 | 53.6 | 58.5 | 49.8 | 69.5 | 70.4 | 78.8 |
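For context, a minimal end-to-end sketch consistent with the corrected README follows. It is an illustration, not the model card's own snippet: it assumes a recent `transformers` release with chat-template support for the text-generation pipeline, the `attn_implementation="eager"` fallback is only needed on V100-class or earlier GPUs, and the generation settings (e.g. `max_new_tokens=200`) are placeholder values.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

model_id = "microsoft/Phi-3-mini-128k-instruct"

# On V100 or earlier GPUs, fall back to eager attention as the README notes;
# on GPUs with flash-attention support the argument can be omitted.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype="auto",
    attn_implementation="eager",  # assumption: running on a pre-Ampere GPU
    trust_remote_code=True,       # may be needed depending on the transformers version
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Prior model turns use the "assistant" role (the fix in this commit), not "system".
messages = [
    {"role": "system", "content": "You are a helpful digital assistant. Please provide safe, ethical and accurate information to the user."},
    {"role": "user", "content": "Can you provide ways to eat combinations of bananas and dragonfruits?"},
    {"role": "assistant", "content": "Sure! Here are some ways to eat bananas and dragonfruits together: 1. Banana and dragonfruit smoothie: Blend bananas and dragonfruits together with some milk and honey. 2. Banana and dragonfruit salad: Mix sliced bananas and dragonfruits together with some lemon juice and honey."},
    {"role": "user", "content": "What about solving an 2x + 3 = 7 equation?"},
]

pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
output = pipe(messages, max_new_tokens=200)  # generation arguments are illustrative
print(output[0]["generated_text"])
```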