microsoft/Phi-3-mini-4k-instruct-gguf

@@ -12,7 +12,8 @@ tags:
 ## Model Summary
-This repo provides the GGUF format for the Phi-3-Mini-4K-Instruct. The Phi-3-Mini-4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, which include both synthetic data and filtered website data, with a focus on high-quality and reasoning-dense properties.
 The model belongs to the Phi-3 family; the Mini version comes in two variants, [4K](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and [128K](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), named for the context length (in tokens) each can support.
 The model has undergone a post-training process that incorporates both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
 When assessed against benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, Phi-3-Mini-4K-Instruct showed robust, state-of-the-art performance among models with fewer than 13 billion parameters.
@@ -53,12 +54,15 @@ Nothing contained in this Model Card should be interpreted as or deemed a restri
 ## Chat Format:
-Given the nature of the training data, the Phi-3-Mini-4K-instruct model is best suited for prompts using the chat format as follows. You can provide the prompt as a question:
 ```markdown
 <|user|>
-How to explain Internet for a medieval knight?
-<|end|>
 <|assistant|>
 ```
@@ -66,13 +70,11 @@ where the model generates the text after "<|assistant|>" . In case of few-shots
 ```markdown
 <|user|>
-I am going to Paris, what should I see?
-<|end|>
 <|assistant|>
-Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:\n\n1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.\n2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.\n3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.\n\nThese are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.
-<|end|>
 <|user|>
-What is so great about #1?
 <|assistant|>
 ```
@@ -129,7 +131,7 @@ Replace "Your prompt here" with the actual prompt you want to use for generating
 ## How to use with Llamafile:
-Assume that you already have GGUF files downloaded. Here is how you can use the GGUF model with [Llamafile](https://github.com/Mozilla-Ocho/llamafile):
 1. **Download Llamafile-0.7.3**
 ```
@@ -180,7 +182,7 @@ prompt = "How to explain Internet to a medieval knight?"
 # Simple inference example
 output = llm(
-  f"<|user|>\n{prompt}\n<|end|>\n<|assistant|>",
   max_tokens=256,  # Generate up to 256 tokens
   stop=["<|end|>"],
   echo=True,  # Whether to echo the prompt

 ## Model Summary
+This repo provides the GGUF format for the Phi-3-Mini-4K-Instruct.
+The Phi-3-Mini-4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality and reasoning-dense properties.
 The model belongs to the Phi-3 family; the Mini version comes in two variants, [4K](https://huggingface.co/microsoft/Phi-3-mini-4k-instruct) and [128K](https://huggingface.co/microsoft/Phi-3-mini-128k-instruct), named for the context length (in tokens) each can support.
 The model has undergone a post-training process that incorporates both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures.
 When assessed against benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, Phi-3-Mini-4K-Instruct showed robust, state-of-the-art performance among models with fewer than 13 billion parameters.
 ## Chat Format:
+Given the nature of the training data, the Phi-3-Mini-4K-instruct model is best suited for prompts using the chat format as follows.
+You can provide the prompt as a question using the following generic template:
+```markdown
+<|user|>\nQuestion <|end|>\n<|assistant|>
+```
+For example:
 ```markdown
 <|user|>
+How to explain Internet for a medieval knight?<|end|>
 <|assistant|>
 ```
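The single-turn format can also be assembled in code. The following is a minimal sketch, assuming the newline placement used in the example in this card; the helper name `build_single_turn_prompt` is hypothetical and not part of any library.

```python
def build_single_turn_prompt(question: str) -> str:
    """Wrap one user question in the Phi-3 chat markers (hypothetical helper)."""
    # Strip surrounding whitespace so <|end|> sits directly after the question text.
    text = question.strip()
    # The prompt ends with an open assistant marker; the model writes what follows.
    return f"<|user|>\n{text}<|end|>\n<|assistant|>"


prompt = build_single_turn_prompt("How to explain Internet for a medieval knight?")
print(prompt)
```

The returned string has the same shape as the f-string passed to `llm(...)` in the inference example of this card.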
 ```markdown
 <|user|>
+I am going to Paris, what should I see?<|end|>
 <|assistant|>
+Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:\n\n1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.\n2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.\n3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.\n\nThese are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.<|end|>
 <|user|>
+What is so great about #1?<|end|>
 <|assistant|>
 ```
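A multi-turn or few-shot prompt repeats the same pattern once per turn. The sketch below is only an illustration under stated assumptions: the function name `build_chat_prompt` and the `(role, content)` tuple layout are hypothetical, only the `user` and `assistant` roles shown in this card are accepted, and the assistant reply in the sample history is abbreviated.

```python
from typing import Iterable, Tuple

# Assumption: only the two roles that appear in this card are supported.
ALLOWED_ROLES = {"user", "assistant"}


def build_chat_prompt(turns: Iterable[Tuple[str, str]]) -> str:
    """Join (role, content) turns into one prompt ending with an open assistant turn."""
    parts = []
    for role, content in turns:
        if role not in ALLOWED_ROLES:
            raise ValueError(f"unsupported role: {role!r}")
        # Every completed turn is closed with <|end|> followed by a newline.
        parts.append(f"<|{role}|>\n{content.strip()}<|end|>\n")
    # Leave the last assistant marker open so the model generates the next reply.
    parts.append("<|assistant|>")
    return "".join(parts)


history = [
    ("user", "I am going to Paris, what should I see?"),
    ("assistant", "Paris, the capital of France, is known for its stunning architecture."),
    ("user", "What is so great about #1?"),
]
print(build_chat_prompt(history))
```

Passing `stop=["<|end|>"]` at inference time, as the inference example in this card does, should end generation when the model closes its own turn.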
 ## How to use with Llamafile:
+Assuming that you already have the GGUF files downloaded, here is how you can use the GGUF model with [Llamafile](https://github.com/Mozilla-Ocho/llamafile):
 1. **Download Llamafile-0.7.3**
 ```
 # Simple inference example
 output = llm(
+  f"<|user|>\n{prompt}<|end|>\n<|assistant|>",
   max_tokens=256,  # Generate up to 256 tokens
   stop=["<|end|>"],
   echo=True,  # Whether to echo the prompt