davda54 committed
Commit 67dcdbf · verified · 1 parent: 5700407

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -32,6 +32,6 @@ continuously pretrained on a total of 260 billion subword tokens -- using a mix
  This model uses a new tokenizer, specially trained on the target languages. Therefore it offers substantially faster inference than the original Mistral-Nemo-Base-2407 model. Here are the subword-to-word split ratios across different languages:
 
  | Tokenizer | # tokens | Bokmål | Nynorsk | Sámi | Danish | Swedish |
-|:------------|--------|--------|---------|-------|--------|---------|
+|:------------|:--------:|:--------:|:---------:|:-------:|:--------:|:---------:|
  | Mistral-Nemo-Base-2407 | 131072 | 1.79 | 1.87 | 2.63 | 1.82 | 2.00 |
  | NorMistral-11b-warm | 51200 | 1.22 | 1.28 | 1.82 | 1.33 | 1.39 |