prithivida/miniDense_telugu_v1

prithivida committed on Jun 2

Commit 6e7e914 • 1 Parent(s): 92abd3f

Update README.md

Files changed (1)

README.md +1 -1


@@ -144,7 +144,7 @@ for query, query_embedding in zip(queries, query_embeddings):
 # FAQS
 #### How can I reduce overall inference cost ?
-- You can use ONNX flavours of these models via [FlashRetrieve](https://github.com/PrithivirajDamodaran/FlashRetrieve) library.
+- You can host these models without heavy torch dependency using the ONNX flavours of these models via [FlashRetrieve](https://github.com/PrithivirajDamodaran/FlashRetrieve) library.
 #### How do I reduce vector storage cost ?
 [Use Binary and Scalar Quantisation](https://huggingface.co/blog/embedding-quantization)
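
The storage FAQ in this hunk points to binary quantisation without showing what it does. A minimal pure-Python sketch of the idea follows, assuming embeddings arrive as plain lists of floats. The helper names `binary_quantize` and `hamming_distance` are hypothetical and are not part of this model, sentence-transformers, or FlashRetrieve. Each dimension is reduced to one bit, so a vector stored as float32 shrinks by roughly 32x.

```python
from typing import List


# Hypothetical helpers for illustration only; not an API of any library named above.
def binary_quantize(embedding: List[float]) -> bytes:
    """Keep one bit per dimension (1 if the value is > 0) and pack 8 bits per byte."""
    bits = [1 if x > 0 else 0 for x in embedding]
    # Pad with zeros so the bits fill whole bytes.
    bits += [0] * (-len(bits) % 8)
    packed = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit  # most significant bit first
        packed.append(byte)
    return bytes(packed)


def hamming_distance(a: bytes, b: bytes) -> int:
    """Count differing bits between two packed vectors; lower means more similar."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))


# Toy vectors standing in for real model output.
doc = binary_quantize([0.5, -0.1, 0.3, 0.0, -2.0, 1.0, 0.2, -0.4])
query = binary_quantize([0.4, -0.3, 0.1, 0.2, -1.0, 0.9, -0.2, -0.1])
distance = hamming_distance(doc, query)
```

Ranking by Hamming distance over packed bits is only an approximation of cosine similarity, so a common pattern is to retrieve a larger candidate set this way and rescore the top hits with the original float embeddings.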