loic-dagnas-sinequa committed
Commit 8dc4e23
1 Parent(s): a3db8bd

Update README.md




@basilevc @skirres

I have specified that by "Chinese" we mean Simplified Chinese, as requested by @ArianeCavet here.

I have also reordered the languages alphabetically by language code. @ArianeCavet, is that OK for you?

Note that `zs` is not recognized by the Hugging Face language tags.
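As a rough illustration of the note about `zs`, here is a minimal sketch that flags tags falling outside a known set of two-letter codes. The `KNOWN_ISO_639_1` set and the `unrecognized_tags` helper are hypothetical names introduced for this example; the set is a hand-picked subset of ISO 639-1 codes and is not the list the Hub actually validates against.

```python
# Hand-picked subset of ISO 639-1 codes covering the languages in this model
# card. This is an assumption for illustration, not the Hub's own tag list.
KNOWN_ISO_639_1 = {"de", "en", "es", "fr", "it", "ja", "nl", "pt", "zh"}


def unrecognized_tags(tags, known=KNOWN_ISO_639_1):
    """Return the tags that are not in the known set, preserving input order."""
    return [tag for tag in tags if tag not in known]


# Language tags as they appear in the updated README front matter.
language = ["de", "en", "es", "fr", "it", "nl", "ja", "pt", "zs"]
print(unrecognized_tags(language))  # ['zs']
```

A check like this could run before pushing a model card, so that a non-standard tag is a deliberate choice rather than a surprise.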

Files changed (1)
  1. README.md +21 -21
README.md CHANGED
@@ -1,18 +1,18 @@
  ---
  pipeline_tag: sentence-similarity
  tags:
- - feature-extraction
- - sentence-similarity
+ - feature-extraction
+ - sentence-similarity
  language:
- - de
- - en
- - es
- - fr
- - it
- - nl
- - ja
- - pt
- - zh
+ - de
+ - en
+ - es
+ - fr
+ - it
+ - nl
+ - ja
+ - pt
+ - zs
  ---
  
  # Model Card for `vectorizer.raspberry`
@@ -27,15 +27,15 @@ Model name: `vectorizer.raspberry`
  
  The model was trained and tested in the following languages:
  
- - English
- - French
  - German
+ - English
  - Spanish
+ - French
  - Italian
  - Dutch
  - Japanese
  - Portuguese
- - Chinese
+ - Simplified Chinese
  
  Besides these languages, basic support can be expected for additional 91 languages that were used during the pretraining
  of the base model (see Appendix A of XLM-R paper).
@@ -115,10 +115,10 @@ We evaluated the model on the datasets of the [MIRACL benchmark](https://github.
  multilingual capacities. Note that not all training languages are part of the benchmark, so we only report the metrics
  for the existing languages.
  
- | Language | Recall@100 |
- |:---------|-----------:|
- | French | 0.650 |
- | German | 0.528 |
- | Spanish | 0.602 |
- | Japanese | 0.614 |
- | Chinese | 0.680 |
+ | Language | Recall@100 |
+ |:--------------------|-----------:|
+ | German | 0.528 |
+ | Spanish | 0.602 |
+ | French | 0.650 |
+ | Japanese | 0.614 |
+ | Simplified Chinese | 0.680 |
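
For readers unfamiliar with the Recall@100 column in the table above, the following is a minimal sketch of the usual definition of recall at a cutoff `k`: the share of a query's relevant documents that appear among the top `k` retrieved results, averaged over queries. The function names and the toy data are hypothetical, and the exact evaluation protocol used for the reported MIRACL numbers may differ.

```python
def recall_at_k(ranked_ids, relevant_ids, k=100):
    """Fraction of the relevant documents found in the top-k ranked results."""
    relevant = set(relevant_ids)
    if not relevant:
        raise ValueError("relevant_ids must not be empty")
    top_k = set(ranked_ids[:k])
    return len(top_k & relevant) / len(relevant)


def mean_recall_at_k(runs, k=100):
    """Average recall@k over (ranked_ids, relevant_ids) pairs, one per query."""
    scores = [recall_at_k(ranked, relevant, k) for ranked, relevant in runs]
    return sum(scores) / len(scores)


# Toy example with made-up document ids: two queries, cutoff k=2.
runs = [
    (["d1", "d2", "d3"], ["d1", "d4"]),  # one of two relevant docs in the top 2
    (["d5", "d6", "d7"], ["d6"]),        # the only relevant doc is in the top 2
]
print(mean_recall_at_k(runs, k=2))  # 0.75
```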