Embeddings model used?
#1 opened by DSNlau
Hello, great job by the way!
May I ask which embeddings model you were using for the documents?
Thanks a lot!
We are using BAAI/bge-m3, which works very well for the main languages in this project, Spanish and Catalan. We tried a few other multilingual embedding models, but BAAI/bge-m3 gave the best results. It supports many languages and is trained specifically to match question-answer pairs. BAAI do not report evaluation results for Catalan specifically, but they do claim that their embeddings work well with low-resource languages, which suggests that decent results can be expected even if Catalan was not strongly represented in the training data. Our own evaluations confirm this.
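For anyone who wants to try something similar, here is a minimal sketch of question-to-document matching with dense embeddings. It is not the code used in this Space. It assumes the model is loaded through the `sentence-transformers` library, and the helper names (`cosine_similarity`, `rank_documents`, `load_bge_m3_embedder`) are made up for illustration.

```python
import math
from typing import Callable, List, Sequence, Tuple

Vector = Sequence[float]
# An embedder maps a list of texts to one vector per text.
Embedder = Callable[[List[str]], List[Vector]]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_documents(
    query: str, documents: List[str], embed: Embedder
) -> List[Tuple[str, float]]:
    """Return (document, score) pairs sorted from most to least similar."""
    query_vec = embed([query])[0]
    doc_vecs = embed(documents)
    scored = [
        (doc, cosine_similarity(query_vec, vec))
        for doc, vec in zip(documents, doc_vecs)
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def load_bge_m3_embedder() -> Embedder:
    """Hypothetical helper: wrap BAAI/bge-m3 as an Embedder.

    Assumes `sentence-transformers` is installed; the model weights are
    downloaded on first use.
    """
    from sentence_transformers import SentenceTransformer  # lazy import

    model = SentenceTransformer("BAAI/bge-m3")

    def embed(texts: List[str]) -> List[Vector]:
        # normalize_embeddings=True returns unit-length vectors
        return model.encode(list(texts), normalize_embeddings=True).tolist()

    return embed
```

With the model available, usage would look like `rank_documents("Quina és la capital de Catalunya?", docs, load_bge_m3_embedder())`, where `docs` is your own list of passages. Keeping the embedder as a plain callable makes it easy to swap in another multilingual model and compare results on your own data.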