ONNX Models

#82
by ha1772007 - opened

Can you please provide a static quantized version of this model?

https://huggingface.co/ha1772007/all-MiniLM-L6-v2-ONNX/tree/main

I tried this myself, but only the fp16, qint8, and quint8 variants are working.

Otherwise, could you please share the ONNX conversion script, if possible?
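
For context, recent sentence-transformers releases can export and dynamically quantize ONNX models directly. A minimal sketch, assuming sentence-transformers >= 3.2 with the ONNX extras installed; the target repo name is hypothetical:

from sentence_transformers import SentenceTransformer, export_dynamic_quantized_onnx_model

# Loading with backend="onnx" exports the model to ONNX if no ONNX file exists yet
model = SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2", backend="onnx")

# Dynamic int8 quantization; the string picks one of the predefined configs
# ("arm64", "avx2", "avx512", "avx512_vnni")
export_dynamic_quantized_onnx_model(
    model,
    quantization_config="avx512_vnni",
    model_name_or_path="my-user/all-MiniLM-L6-v2-ONNX",  # hypothetical repo
    push_to_hub=True,
    create_pr=True,
)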

@tomaarsen can you please share a quick way to create the OpenVINO and quantized models in an existing repo?

I'm running into an issue. This is what I do:

  • download the model locally via
fast_model = SentenceTransformer('intfloat/multilingual-e5-small', backend="openvino")
  • since no OpenVINO XML file is found, the model gets exported to OpenVINO.
  • then quantizing and uploading the model to my repo fails:
from sentence_transformers import export_static_quantized_openvino_model

# quantization_config=None falls back to the default OVQuantizationConfig
export_static_quantized_openvino_model(
    fast_model,
    quantization_config=None,
    model_name_or_path="my-repo/multilingual-e5-small-openvino",
    push_to_hub=True,
    create_pr=True,
)
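
In case the default calibration data is what trips the shape inference, the exporter also accepts an explicit calibration dataset (as far as I can tell, all four dataset arguments must be given together). A sketch under that assumption; the dataset and repo names are only illustrative:

from optimum.intel import OVQuantizationConfig
from sentence_transformers import export_static_quantized_openvino_model

export_static_quantized_openvino_model(
    fast_model,
    quantization_config=OVQuantizationConfig(),  # default int8 static quantization
    model_name_or_path="my-repo/multilingual-e5-small-openvino",
    dataset_name="glue",                # illustrative calibration dataset
    dataset_config_name="sst2",
    dataset_split="validation",
    column_name="sentence",
    push_to_hub=True,
    create_pr=True,
)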

I have an Intel CPU with enough memory. The error:

[CPU] Add node with name '__module.embeddings/aten::add/Add' Exception from src\plugins\intel_cpu\src\shape_inference\custom\eltwise.cpp:45:
Eltwise shape infer input shapes dim index: 1 mismatch
