This repository contains the model files needed to perform BM25 searches with FastEmbed.

BM25 (Best Matching 25) is a ranking function used by search engines to estimate the relevance of documents to a given search query.

## Usage

Note: this model is intended to be used with Qdrant. Sparse vectors in the target collection must be configured with the `Modifier.IDF` modifier, as in the sketch below.
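For reference, here is a minimal sketch of such a collection configuration with the Python `qdrant-client`. The collection name `bm25-demo`, the sparse vector name `bm25`, and the local URL are illustrative assumptions, not part of this model card; it also assumes qdrant-client ≥ 1.10, where sparse vector modifiers are available.

```python
from qdrant_client import QdrantClient, models

# Assumes a Qdrant instance reachable locally (hypothetical URL).
client = QdrantClient(url="http://localhost:6333")

client.create_collection(
    collection_name="bm25-demo",  # hypothetical collection name
    vectors_config={},            # no dense vectors in this sketch
    sparse_vectors_config={
        # IDF is applied on the server side, so the client only sends
        # the term statistics produced by the BM25 model.
        "bm25": models.SparseVectorParams(modifier=models.Modifier.IDF),
    },
)
```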

Here's an example of generating BM25 sparse embeddings with FastEmbed.

```python
from fastembed import SparseTextEmbedding

documents = [
    "You should stay, study and sprint.",
    "History can only prepare us to be surprised yet again.",
]

model = SparseTextEmbedding(model_name="Qdrant/bm25")
embeddings = list(model.embed(documents))

# [
#     SparseEmbedding(
#         values=array([1.67419738, 1.67419738, 1.67419738, 1.67419738]),
#         indices=array([171321964, 1881538586, 150760872, 1932363795])),
#     SparseEmbedding(
#         values=array([1.66973021, 1.66973021, 1.66973021, 1.66973021, 1.66973021]),
#         indices=array([578407224, 1849833631, 1008800696, 2090661150, 1117393019]))
# ]
```
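Once a collection like the one above exists, the embeddings can be uploaded and queried roughly as follows. This is a sketch, not a prescribed workflow: the `bm25-demo` collection, the `bm25` sparse vector name, and the local URL are the same hypothetical assumptions as above, and `query_points` requires qdrant-client ≥ 1.10.

```python
from fastembed import SparseTextEmbedding
from qdrant_client import QdrantClient, models

documents = [
    "You should stay, study and sprint.",
    "History can only prepare us to be surprised yet again.",
]

model = SparseTextEmbedding(model_name="Qdrant/bm25")
client = QdrantClient(url="http://localhost:6333")

# Upload one point per document, storing the BM25 sparse vector
# under the name configured on the collection.
client.upsert(
    collection_name="bm25-demo",
    points=[
        models.PointStruct(
            id=idx,
            payload={"text": doc},
            vector={
                "bm25": models.SparseVector(
                    indices=emb.indices.tolist(),
                    values=emb.values.tolist(),
                )
            },
        )
        for idx, (doc, emb) in enumerate(zip(documents, model.embed(documents)))
    ],
)

# Embed the query with query_embed and search against the sparse vector.
query = next(model.query_embed("When should I study?"))
hits = client.query_points(
    collection_name="bm25-demo",
    query=models.SparseVector(
        indices=query.indices.tolist(),
        values=query.values.tolist(),
    ),
    using="bm25",
    limit=3,
)
print(hits)
```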
