|
You can either use the `transformers` library, load the model for conditional generation, and inspect the relevance tokens yourself, or use the monoT5 implementation from BEIR.
|
|
|
prompt = `Query: {query} Document: {document} Relevant:` |
|
|
|
The model answers with one of two tokens indicating whether the document is relevant:
|
```
token_false='▁fałsz', token_true='▁prawda'
```
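
If you go the plain `transformers` route, the idea is to feed the prompt to the model, decode a single step, and compare the logits of the true/false tokens. Below is a minimal sketch; the checkpoint name is a placeholder and the Polish relevance tokens are taken from the snippet above:

```
import torch
from transformers import T5ForConditionalGeneration, T5Tokenizer

model_name = "your/monot5-checkpoint"  # hypothetical checkpoint
tokenizer = T5Tokenizer.from_pretrained(model_name)
model = T5ForConditionalGeneration.from_pretrained(model_name)
model.eval()

token_true_id = tokenizer.convert_tokens_to_ids("▁prawda")
token_false_id = tokenizer.convert_tokens_to_ids("▁fałsz")

def relevance_score(query: str, document: str) -> float:
    prompt = f"Query: {query} Document: {document} Relevant:"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True)
    # Run a single decoder step, starting from the decoder start token.
    decoder_input_ids = torch.full(
        (1, 1), model.config.decoder_start_token_id, dtype=torch.long
    )
    with torch.no_grad():
        logits = model(**inputs, decoder_input_ids=decoder_input_ids).logits[0, 0]
    # Softmax over just the false/true tokens gives a relevance probability.
    probs = torch.softmax(logits[[token_false_id, token_true_id]], dim=0)
    return probs[1].item()
```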
|
|
|
|
|
A MonoT5 implementation is included in the BEIR benchmark (https://github.com/beir-cellar/beir):
|
```
from beir.reranking.models import MonoT5
from beir.reranking import Rerank

queries = YOUR_QUERIES
corpus = YOUR_CORPUS
queries = {query['id']: query['text'] for query in queries}
corpus = {doc['id']: {'title': doc['title'], 'text': doc['text']} for doc in corpus}

# model_path points to your monoT5 checkpoint; pass the model's own
# relevance tokens explicitly.
cross_encoder_model = MonoT5(model_path, use_amp=False, token_false='▁fałsz', token_true='▁prawda')
reranker = Rerank(cross_encoder_model, batch_size=100)

# 'results' are the first-stage retrieval scores: {query_id: {doc_id: score}}
rerank_results = reranker.rerank(corpus, queries, results, top_k=100)
```
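
The `results` argument is the first-stage retrieval output that BEIR expects: a dict mapping each query id to `{doc_id: score}`. A minimal sketch of producing it with BEIR's BM25 retriever (the Elasticsearch hostname and index name here are assumptions):

```
from beir.retrieval.search.lexical import BM25Search as BM25
from beir.retrieval.evaluation import EvaluateRetrieval

# Requires a running Elasticsearch instance.
bm25 = BM25(index_name="my-index", hostname="localhost", initialize=True)
retriever = EvaluateRetrieval(bm25)
results = retriever.retrieve(corpus, queries)  # {query_id: {doc_id: score}}
```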