ONNX Models (#82), opened by ha1772007
Could you please provide a statically quantized version of this model?
https://huggingface.co/ha1772007/all-MiniLM-L6-v2-ONNX/tree/main
I tried this, but only the fp16, qint8 and quint8 variants are working.
Otherwise, could you please share the ONNX conversion script, if possible?
Done! See https://huggingface.co/sentence-transformers/all-MiniLM-L6-v2/tree/main/onnx
Also, this describes the usage: https://sbert.net/docs/sentence_transformer/usage/efficiency.html
@tomaarsen could you please share a quick way to export a model to OpenVINO, quantize it, and push the result to an existing repo?
- given a model: https://huggingface.co/intfloat/multilingual-e5-small
- convert the model to OpenVINO -> quantize
- push to a different HF repo
I'm running into an issue.
This is what I do:
- load the model via
from sentence_transformers import SentenceTransformer
fast_model = SentenceTransformer('intfloat/multilingual-e5-small', backend="openvino")
- it does not find the xml file, so it exports the model to OpenVINO.
- then quantizing and uploading the model to my repo fails:
from sentence_transformers import export_static_quantized_openvino_model
export_static_quantized_openvino_model(
    fast_model,
    quantization_config=None,
    model_name_or_path="my-repo/multilingual-e5-small-openvino",
    push_to_hub=True,
    create_pr=True,
)
I have an Intel CPU with enough memory.
Issue:
[CPU] Add node with name '__module.embeddings/aten::add/Add' Exception from src\plugins\intel_cpu\src\shape_inference\custom\eltwise.cpp:45:
Eltwise shape infer input shapes dim index: 1 mismatch
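One way to narrow this down is to separate the three steps (export, quantize, upload) and see which one raises the error. This is a sketch, not a confirmed fix. It uses the same calls as above but writes to a local folder instead of pushing. `quantize_locally` and the output folder name are made up for illustration. It assumes that `model_name_or_path` is used as a local directory when `push_to_hub=False`, and that `sentence-transformers[openvino]` is installed.

```python
from pathlib import Path


def quantize_locally(model_id: str, output_dir: str) -> Path:
    """Export to OpenVINO, run one encode as a sanity check, then quantize locally.

    Hypothetical helper for debugging; nothing is uploaded to the Hub.
    """
    # Imported lazily so the helper can be defined even when the optional
    # OpenVINO dependencies are not installed yet.
    from sentence_transformers import (
        SentenceTransformer,
        export_static_quantized_openvino_model,
    )

    # Step 1: load (and, if no xml file is found, export) the OpenVINO model.
    model = SentenceTransformer(model_id, backend="openvino")

    # Step 2: if this call already raises the Eltwise shape error, the problem
    # is in the exported model itself, not in quantization or upload.
    model.encode(["query: hello world"])

    # Step 3: static quantization with the default config, saved locally only.
    # Assumption: with push_to_hub=False, model_name_or_path is a local path.
    export_static_quantized_openvino_model(
        model,
        quantization_config=None,
        model_name_or_path=output_dir,
        push_to_hub=False,
        create_pr=False,
    )
    return Path(output_dir)
```

Calling `quantize_locally("intfloat/multilingual-e5-small", "multilingual-e5-small-openvino")` should show whether the failure comes from the plain OpenVINO model or only appears during quantization. If the local run succeeds, the same call with `push_to_hub=True` and the repo id is the remaining step to test.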