sdadas/polish-reranker-bge-v2

sdadas committed on Sep 28, 2024

Commit d64e2b3 · verified · 1 Parent(s): 62f1366

Update README.md

Files changed (1)

README.md +1 -1

@@ -16,7 +16,7 @@ This is a reranker for Polish based on [BAAI/bge-reranker-v2-m3](https://hugging
 - After the training, we merged the original and fine-tuned weights to create the final checkpoint
 - We used a custom implementation of XLM-RoBERTa with support for Flash Attention 2. If you want to use these features, load the model with the arguments `trust_remote_code=True` and `attn_implementation="flash_attention_2"`. This is especially important for this model, since [BAAI/bge-reranker-v2-m3](https://huggingface.co/BAAI/bge-reranker-v2-m3) supports long contexts of 8192 tokens. For such input length, the inference can be up to 400% faster with Flash Attention in comparison to the original model.
-In most cases, the use of [sdadas/polish-reranker-roberta-v2](https://huggingface.co/sdadas/polish-reranker-roberta-v2) is preferred to this model as it achieves better results for Polish. The main advantage of this model is its context length, so it can perform better on datasets with long documents.
+In most cases, the use of [sdadas/polish-reranker-roberta-v2](https://huggingface.co/sdadas/polish-reranker-roberta-v2) is preferred to this model as it achieves better results for Polish. The main advantage of this model is its context length, so it may perform better on some datasets with long documents.
 ## Usage (Huggingface Transformers)
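
The loading arguments mentioned in the context lines of this diff can be sketched as follows. This is a minimal illustration, not the README's own usage snippet: `build_load_kwargs` and `load_reranker` are hypothetical helper names, and the sketch assumes a CUDA GPU, an installed `flash-attn` package, and `bfloat16` weights (Flash Attention 2 needs half-precision tensors).

```python
MODEL_NAME = "sdadas/polish-reranker-bge-v2"


def build_load_kwargs(use_flash_attention: bool) -> dict:
    """Hypothetical helper: collect the extra from_pretrained arguments."""
    kwargs = {}
    if use_flash_attention:
        # Both arguments are the ones named in the README text.
        kwargs["trust_remote_code"] = True
        kwargs["attn_implementation"] = "flash_attention_2"
    return kwargs


def load_reranker(use_flash_attention: bool = True):
    """Hypothetical helper: load the tokenizer and the reranker model."""
    # Imported lazily so the kwargs helper can be used without these packages.
    import torch
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModelForSequenceClassification.from_pretrained(
        MODEL_NAME,
        torch_dtype=torch.bfloat16,  # assumption: half precision on the GPU
        **build_load_kwargs(use_flash_attention),
    )
    # Assumption: a CUDA device is available.
    return tokenizer, model.to("cuda").eval()
```

Calling `load_reranker(use_flash_attention=False)` falls back to the default attention implementation, which may be the safer choice on hardware where Flash Attention 2 is not available.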