answerdotai/JaColBERTv2.4

bclavie committed on Jul 29, 2024

Commit 7637b60 (verified) · 1 parent: 5d69d93

Create README.md

Files changed (1)

README.md (added) +22 -0

	@@ -0,0 +1,22 @@

+---
+inference: false
+datasets:
+- answerdotai/MMARCO-japanese-32-scored-triplets
+- unicamp-dl/mmarco
+language:
+- ja
+pipeline_tag: sentence-similarity
+tags:
+- ColBERT
+base_model:
+- cl-tohoku/bert-base-japanese-v3
+- bclavie/JaColBERT
+license: mit
+library_name: RAGatouille
+---
+Model weights for the JaColBERTv2.4 checkpoint, the version of JaColBERTv2.5 before post-training. It uses an entirely overhauled training recipe and was trained on just 40% of the data used for JaColBERTv2.
+This model largely outperforms all previous approaches, including JaColBERTv2 and multilingual models such as BGE-M3, on all datasets.
+This page will be updated with the full details and the model report in the next few days.