How exactly do we apply scalar quantization and product quantization to these embeddings?
How do we get an embedding with float16 values?
Is truncation the way to get the benefit of MRL? So this means that mxbai-embed-large-v1 was trained to return a 1024-dimension embedding, but because it uses MRL, we can safely keep only the first 512 values?
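For what it's worth, a minimal sketch of both ideas, assuming you encode with sentence-transformers: its `truncate_dim` parameter keeps only the leading dimensions of an MRL-trained model, and float16 is just a cast of the float32 output.

```python
import numpy as np
from sentence_transformers import SentenceTransformer

# truncate_dim keeps only the first 512 of the model's 1024 dimensions;
# this is only safe because the model was trained with Matryoshka (MRL)
model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1", truncate_dim=512)

embeddings = model.encode(["The weather is lovely today."])
print(embeddings.shape)  # (1, 512)

# float16 is simply a cast of the float32 output, halving storage
embeddings_fp16 = embeddings.astype(np.float16)
print(embeddings_fp16.dtype)  # float16
```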
Hopefully someone will respond to this. I'm asking the same thing: I'm interested in using a different encoding format for the embeddings, like binary or int8 instead of the default float32, but the "feature_extraction" function does not have such a parameter (yet??). Please let us know if this will change, or how to use a different encoding format when using huggingface_hub with this model.
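Until such a parameter exists, a minimal sketch of a client-side workaround, assuming huggingface_hub's `InferenceClient`: fetch the float32 embedding as usual and quantize it yourself with NumPy. The int8 scaling below is a naive stand-in for proper calibration.

```python
import numpy as np
from huggingface_hub import InferenceClient

client = InferenceClient()
embedding = np.asarray(
    client.feature_extraction(
        "The weather is lovely today.",
        model="mixedbread-ai/mxbai-embed-large-v1",
    ),
    dtype=np.float32,
)

# binary: one bit per dimension, packed into bytes (32x smaller)
binary = np.packbits(embedding > 0)

# naive int8 scalar quantization: map [-max, max] onto [-127, 127];
# a real setup would calibrate the range on a held-out sample instead
scale = np.abs(embedding).max() / 127.0
int8 = np.round(embedding / scale).astype(np.int8)
```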
@ralucab You should check out sentence transformers for this: https://sbert.net/examples/applications/embedding-quantization/README.html
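To make that link concrete, a minimal sketch using `quantize_embeddings` from sentence-transformers, which covers the scalar (int8) and binary cases asked about above. Product quantization is typically applied on the vector-index side (e.g., FAISS's IndexPQ) rather than by the embedding library.

```python
from sentence_transformers import SentenceTransformer
from sentence_transformers.quantization import quantize_embeddings

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")
embeddings = model.encode(
    ["The weather is lovely today.", "It's so sunny outside!"]
)

# binary: one bit per dimension, packed into bytes (32x smaller)
binary_embeddings = quantize_embeddings(embeddings, precision="binary")

# int8 (scalar quantization): one signed byte per dimension (4x smaller);
# the calibration set should normally be much larger than the batch itself
int8_embeddings = quantize_embeddings(
    embeddings, precision="int8", calibration_embeddings=embeddings
)

print(binary_embeddings.shape)  # (2, 128)
print(int8_embeddings.shape)    # (2, 1024)
```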