bclavie/JaColBERT

@@ -50,7 +50,7 @@ Moreover, this approach requires **considerably less data than dense embeddings*
 ### Training Data
-The model is trained on the japanese split of MMARCO, augmented with hard negatives. [The data, including the hard negatives, is available on huggingface datasets](bclavie/mmarco-japanese-hard-negatives).
+The model is trained on the japanese split of MMARCO, augmented with hard negatives. [The data, including the hard negatives, is available on huggingface datasets](https://huggingface.co/datasets/bclavie/mmarco-japanese-hard-negatives).
 We do not train nor perform data augmentation on any other dataset at this stage. We hope to do so in future work, or support practitioners intending to do so (feel free to [reach out](mailto:[email protected])).