libryo-ai/BAAI-bge-m3-int8

Feature Extraction

text-embeddings-inference

Inference Endpoints


Alex Friedmann

Upload 7 files

7698c0c verified 4 months ago


	---
	license: mit
	---

	The [BAAI/bge-m3](https://huggingface.co/BAAI/bge-m3) model (dense retriever only), converted to ONNX int8 format for use with [Vespa Embedding](https://docs.vespa.ai/en/embedding.html).

	- BAAI-bge-m3_quantized.onnx (int8 quantized)

	The model was quantized using the [optimum](https://github.com/huggingface/optimum) toolkit.
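To give a rough idea of what "int8 quantized" means for the weights, here is a minimal sketch of symmetric per-tensor quantization in plain Python. It is an illustration only: the helper names `quantize_int8` and `dequantize_int8` and the sample values are hypothetical, and the scheme optimum and ONNX Runtime actually apply may differ (for example per-channel scales or zero points).

```python
def quantize_int8(values):
    """Symmetric per-tensor quantization: floats -> int8 codes plus one scale."""
    max_abs = max(abs(v) for v in values)
    # Map the largest magnitude to 127; guard against an all-zero tensor.
    scale = max_abs / 127 if max_abs > 0 else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize_int8(codes, scale):
    """Recover approximate floats from int8 codes and their scale."""
    return [c * scale for c in codes]


# Hypothetical sample weights, chosen only for illustration.
weights = [0.5, -1.27, 0.003, 1.0]
codes, scale = quantize_int8(weights)
restored = dequantize_int8(codes, scale)

# Rounding keeps the per-value error within half a quantization step.
max_error = max(abs(w - r) for w, r in zip(weights, restored))
```

Each weight is stored in one byte instead of four, at the cost of a small rounding error bounded by half the scale.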

	## Tips: converting to int8 quantized ONNX
	```
	# https://github.com/vespa-engine/sample-apps/blob/master/simple-semantic-search/export_hf_model_from_hf.py
	./export_hf_model_from_hf.py --hf_model BAAI/bge-m3 --output_dir bge-m3
	```

	```
	optimum-cli onnxruntime quantize --onnx_model ./bge-m3 -o bge-m3-large_quantized --avx512_vnni
	```
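The same quantization step can probably also be driven from Python through optimum's `ORTQuantizer` API. The sketch below is an approximation of what the CLI command does, not a verified equivalent: the function name `quantize_onnx_avx512_vnni` is hypothetical, and the settings `is_static=False` and `per_channel=False` are assumptions that should be checked against the installed optimum version.

```python
def quantize_onnx_avx512_vnni(onnx_dir, output_dir):
    """Dynamically quantize the ONNX model found in onnx_dir to int8."""
    # Imported lazily so this file can be loaded without optimum installed.
    from optimum.onnxruntime import ORTQuantizer
    from optimum.onnxruntime.configuration import AutoQuantizationConfig

    # Assumption: onnx_dir holds exactly one .onnx file; otherwise pass
    # file_name="..." to from_pretrained to pick the model explicitly.
    quantizer = ORTQuantizer.from_pretrained(onnx_dir)

    # Dynamic (non-static) quantization needs no calibration dataset.
    qconfig = AutoQuantizationConfig.avx512_vnni(is_static=False, per_channel=False)

    # Writes the quantized model into output_dir.
    quantizer.quantize(save_dir=output_dir, quantization_config=qconfig)
    return output_dir
```

With the paths from the command above, the call would be `quantize_onnx_avx512_vnni("./bge-m3", "bge-m3-large_quantized")`.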

	## License
	This model is released under the MIT License, the same license as the original model (see the LICENSE file in the project's root directory).
	- https://huggingface.co/BAAI/bge-m3

	## Attribution
	All credit for this model goes to the authors of BAAI/bge-m3 and the associated researchers and organizations. When using this model, please attribute the original authors.