Phi-3-128K-Instruct-ov-fp16-int4-asym
Model Description
This is a version of the original Phi-3-128K-Instruct model, converted to the OpenVINO™ IR (Intermediate Representation) format for optimized inference on Intel® hardware. It was created using the procedures detailed in the OpenVINO™ Notebooks repository.
Intended Use
This model is designed for advanced natural language understanding and generation tasks, ideal for developers and researchers in both academic and commercial settings who require efficient AI capabilities for devices with limited computational power. It is not intended for use in creating or promoting harmful or illegal content, in accordance with the guidelines outlined in the Phi-3 Acceptable Use Policy.
Licensing and Redistribution
This model is released under the MIT license.
Weight Compression Parameters
For more information on these parameters, refer to the OpenVINO™ 2024.1.0 documentation.
- mode: INT4_ASYM
- group_size: 128
- ratio: 0.8
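To make the parameters above more concrete, here is a minimal pure-Python sketch of asymmetric 4-bit group quantization. It is an illustration only, not the NNCF/OpenVINO™ implementation: the function names are hypothetical, the per-group minimum is stored as a float offset in place of an integer zero point, and real toolchains pack two 4-bit codes per byte. Each group of `group_size` weights shares one scale and one offset, and each weight is stored as a code in the range 0 to 15. The `ratio` parameter is not modeled here; as we understand it, a value of 0.8 means that roughly 80% of the eligible layers are compressed to INT4 while the rest stay in a higher backup precision.

```python
# Illustrative sketch of INT4 asymmetric group quantization (hypothetical helper names).

def quantize_group(weights):
    """Map one group of floats to 4-bit codes (0..15) plus a shared scale and offset."""
    w_min, w_max = min(weights), max(weights)
    scale = (w_max - w_min) / 15
    if scale == 0:
        # Constant group: any non-zero scale works, every code becomes 0
        scale = 1.0
    codes = [min(15, max(0, round((w - w_min) / scale))) for w in weights]
    return codes, scale, w_min


def dequantize_group(codes, scale, w_min):
    """Reconstruct approximate floats from the 4-bit codes."""
    return [c * scale + w_min for c in codes]


def quantize_weights(weights, group_size=128):
    """Split a flat weight list into groups and quantize each group independently."""
    return [
        quantize_group(weights[start:start + group_size])
        for start in range(0, len(weights), group_size)
    ]


# Example: 300 synthetic weights give three groups (128, 128 and 44 values)
example_weights = [((i * 37) % 101) / 50.0 - 1.0 for i in range(300)]
example_groups = quantize_weights(example_weights, group_size=128)
```

A smaller `group_size` gives each scale and offset fewer weights to cover, which usually lowers the reconstruction error at the cost of storing more quantization parameters.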
Running Model Inference
Install packages required for using Optimum Intel integration with the OpenVINO™ backend:
pip install --upgrade --upgrade-strategy eager "optimum[openvino]"
from optimum.intel.openvino import OVModelForCausalLM
from transformers import AutoTokenizer, pipeline
# Hub repository ID (or a local path to the exported model)
model_id = "microsoft/Phi-3-128K-Instruct-ov-fp16-int4-asym"
# Initialize the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = OVModelForCausalLM.from_pretrained(model_id)
# Build a text-generation pipeline; the tokenizer is passed explicitly because the model is an object, not a name
pipe = pipeline("text-generation", model=model, tokenizer=tokenizer)
# Generate a response and print the result
print(pipe("I am in Paris, plan me a 2-week trip", max_new_tokens=256))