davda54 committed
Commit 6b4d81d · verified · 1 parent: 1d902c0

Update README.md

Files changed (1):
  1. README.md +1 -1
README.md CHANGED
@@ -28,6 +28,6 @@ continuously pretrained on a total of 260 billion subword tokens -- using a mix
 This model uses a new tokenizer, specially trained on the target languages. Therefore it offers substantially faster inference than the original Mistral-Nemo-Base-2407 model. Here are the subword-to-word split ratios accross different languages:
 
 | Tokenizer | # tokens | Bokmål | Nynorsk | Sámi | Danish | Swedish | English |
-|------------|--------|---------|-------|--------|---------|---------|
+|------------|--------|--------|---------|-------|--------|---------|---------|
 | Mistral-Nemo-Base-2407 | 51200| 1.79 | 1.87 | 2.63 | 1.82 | 2.00 | 1.33 |
 | NorMistral-11b | 131072 | 1.22 | 1.28 | 1.82 | 1.33 | 1.39 | 1.29 |