The speed of obtaining embeddings on CPU/GPU

by hiauiarau - opened Jul 31, 2024

Jul 31, 2024

Hello, could you please tell me the embedding generation speed on CPU and GPU in FP16 and FP32, and how it compares with BAAI/bge-m3? Also, is it possible to obtain the model in ONNX format?

mpieck

Sep 2, 2024

In FP16, it's 10 times slower than bge-m3.
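
For anyone who wants to check a figure like this on their own hardware, here is a minimal timing sketch. It is not the benchmark behind the number above. The names `sentences_per_second`, `slowdown`, and `dummy_encode` are made up for illustration, and `dummy_encode` is only a stand-in so the snippet runs without downloading a model. In practice you would pass a real encoder callable instead (for example a loaded model's batch encode method) and run it once per model and precision. On GPU, make sure the encoder returns only after the device work is finished, otherwise the timing will look too good.

```python
import time
from typing import Callable, List, Sequence


def sentences_per_second(
    encode: Callable[[Sequence[str]], object],
    sentences: Sequence[str],
    batch_size: int = 32,
    warmup: int = 1,
    repeats: int = 3,
) -> float:
    """Return throughput (sentences/second) using the best of `repeats` timed passes."""
    batches = [sentences[i:i + batch_size] for i in range(0, len(sentences), batch_size)]

    # Warm-up passes are not timed (caches, lazy initialisation, and so on).
    for _ in range(warmup):
        for batch in batches:
            encode(batch)

    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        for batch in batches:
            encode(batch)
        best = min(best, time.perf_counter() - start)

    # Guard against a zero reading from a very fast stand-in encoder.
    return len(sentences) / max(best, 1e-9)


def slowdown(baseline_sps: float, candidate_sps: float) -> float:
    """How many times slower the candidate is than the baseline."""
    return baseline_sps / candidate_sps


def dummy_encode(batch: Sequence[str]) -> List[List[float]]:
    # Hypothetical stand-in for a real model: returns a fake 8-dim vector per sentence.
    return [[float(len(s))] * 8 for s in batch]


if __name__ == "__main__":
    data = ["an example sentence"] * 256
    sps = sentences_per_second(dummy_encode, data)
    print(f"{sps:.0f} sentences/s")
```

Measuring both models with the same sentences, batch size, and precision, then passing the two throughput numbers to `slowdown`, gives a ratio that can be compared directly with the "10 times" figure.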
