license: apache-2.0
language:
- ar
pipeline_tag: text-classification
library_name: transformers
base_model:
- Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2
tags:
- reranking
- sentence-transformers
datasets:
- unicamp-dl/mmarco
Namaa-Reranker-v1
NAMAA-Space releases Namaa-Reranker-v1, a high-performance reranking model fine-tuned on unicamp-dl/mmarco for Arabic document retrieval and ranking.
This model is designed to improve the search relevance of Arabic documents by ranking them according to how well they fit a given query in context.
Key Features
- Optimized for Arabic: Built on the highly performant Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2 and trained exclusively on rich Arabic data.
- Advanced Document Ranking: Ranks results with precision, perfect for search engines, recommendation systems, and question-answering applications.
- State-of-the-Art Performance: Performs strongly against well-known rerankers (see Evaluation), ensuring reliable relevance and precision.
Example Use Cases
- Retrieval-Augmented Generation: Improve the relevance of retrieved Arabic content.
- Content Recommendation: Deliver top-tier Arabic content suggestions.
- Question Answering: Boost answer retrieval quality in Arabic-focused systems.
Usage
Within sentence-transformers
Usage is straightforward once sentence-transformers is installed. You can then load and use the pre-trained model like this:
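from sentence_transformers import CrossEncoder

# Load the reranker; inputs longer than 512 tokens are truncated.
model = CrossEncoder('NAMAA-Space/Namaa-Reranker-v1', max_length=512)

# "How can deep learning be used in medical image processing?"
Query = 'كيف يمكن استخدام التعلم العميق في معالجة الصور الطبية؟'
# "Deep learning helps analyze medical images and diagnose diseases"
Paragraph1 = 'التعلم العميق يساعد في تحليل الصور الطبية وتشخيص الأمراض'
# "Artificial intelligence is used to improve productivity in industries"
Paragraph2 = 'الذكاء الاصطناعي يستخدم في تحسين الإنتاجية في الصناعات'

# One relevance score per (query, paragraph) pair, in input order.
scores = model.predict([(Query, Paragraph1), (Query, Paragraph2)])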
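The scores returned by `predict` come back in input order, so a typical next step is to sort the candidates by score. The following is a minimal sketch, not part of the original card: the `rerank` helper and the `word_overlap_scorer` stand-in are hypothetical names introduced for illustration, and the sketch assumes that a higher score means a more relevant document. The stand-in scorer exists only so the example runs without downloading the model.

```python
from typing import Callable, List, Optional, Sequence, Tuple


def rerank(
    query: str,
    documents: Sequence[str],
    predict: Callable[[List[Tuple[str, str]]], Sequence[float]],
    top_k: Optional[int] = None,
) -> List[Tuple[str, float]]:
    """Score every (query, document) pair and return documents sorted best-first.

    `predict` is any callable with the same shape as CrossEncoder.predict:
    it takes a list of (query, document) pairs and returns one score per pair.
    Assumption: higher scores indicate higher relevance.
    """
    pairs = [(query, doc) for doc in documents]
    scores = [float(s) for s in predict(pairs)]
    ranked = sorted(zip(documents, scores), key=lambda item: item[1], reverse=True)
    return ranked if top_k is None else ranked[:top_k]


def word_overlap_scorer(pairs: List[Tuple[str, str]]) -> List[float]:
    """Hypothetical stand-in scorer: counts words shared by query and document."""
    return [float(len(set(q.split()) & set(d.split()))) for q, d in pairs]


docs = ["deep learning for medical images", "productivity in industry"]
ranked = rerank("deep learning medical", docs, word_overlap_scorer)
for doc, score in ranked:
    print(f"{score:.2f}\t{doc}")

# With the real model loaded, the call would look like:
# ranked = rerank(Query, [Paragraph1, Paragraph2], model.predict, top_k=1)
```

Passing the scorer as an argument keeps the sorting logic independent of the model, which makes it easy to test or to swap in a different reranker.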
Evaluation
We evaluate our model on two datasets using MAP, MRR, and NDCG@10.
The purpose of this evaluation is to show how the model performs in two settings, relevant/irrelevant labels and a positive document against multiple negative documents:
Dataset 1: NAMAA-Space/Ar-Reranking-Eval
Dataset 2: NAMAA-Space/Arabic-Reranking-Triplet-5-Eval
As seen, the model performs extremely well in comparison to other well-known rerankers.
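For readers unfamiliar with the three metrics named in this section, here is a minimal sketch of how MAP, MRR, and NDCG@10 can be computed from binary relevance labels. It is an illustration under stated assumptions, not the evaluation script behind the reported results: the function names and the example labels are hypothetical, average precision is normalized by the number of relevant documents present in the ranked list, and NDCG uses linear gain with a log2 discount.

```python
import math
from typing import List, Sequence


def average_precision(relevance: Sequence[int]) -> float:
    """AP for one query; `relevance` holds 0/1 labels in ranked order."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant position
    return total / hits if hits else 0.0


def reciprocal_rank(relevance: Sequence[int]) -> float:
    """1 / rank of the first relevant document, or 0.0 if there is none."""
    for rank, rel in enumerate(relevance, start=1):
        if rel:
            return 1.0 / rank
    return 0.0


def ndcg_at_k(relevance: Sequence[int], k: int = 10) -> float:
    """NDCG@k with linear gain and a log2 position discount."""
    def dcg(labels: Sequence[int]) -> float:
        return sum(rel / math.log2(rank + 1)
                   for rank, rel in enumerate(labels[:k], start=1))

    ideal = dcg(sorted(relevance, reverse=True))
    return dcg(relevance) / ideal if ideal else 0.0


def mean(values: List[float]) -> float:
    """Arithmetic mean over queries; 0.0 for an empty list."""
    return sum(values) / len(values) if values else 0.0


# Hypothetical labels for two queries, in the order a reranker returned them.
runs = [[1, 0, 0], [0, 1, 0, 1]]
map_score = mean([average_precision(r) for r in runs])
mrr_score = mean([reciprocal_rank(r) for r in runs])
ndcg_score = mean([ndcg_at_k(r, k=10) for r in runs])
print(f"MAP={map_score:.4f}  MRR={mrr_score:.4f}  NDCG@10={ndcg_score:.4f}")
```

MAP and MRR are the per-query values averaged over all queries, which is why each metric is computed per ranked list first and aggregated afterwards.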
Base model: Omartificial-Intelligence-Space/Arabic-Triplet-Matryoshka-V2
Dataset: Arabic-mmarco-triplet (random sample of 1 million examples)