Update README.md
README.md CHANGED
@@ -28,6 +28,6 @@ continuously pretrained on a total of 260 billion subword tokens -- using a mix
 This model uses a new tokenizer, specially trained on the target languages. Therefore, it offers substantially faster inference than the original Mistral-Nemo-Base-2407 model. Here are the subword-to-word split ratios across different languages:
 
 | Tokenizer | # tokens | Bokmål | Nynorsk | Sámi | Danish | Swedish | English |
-
+|------------|--------|--------|---------|-------|--------|---------|---------|
 | Mistral-Nemo-Base-2407 | 51200 | 1.79 | 1.87 | 2.63 | 1.82 | 2.00 | 1.33 |
 | NorMistral-11b | 131072 | 1.22 | 1.28 | 1.82 | 1.33 | 1.39 | 1.29 |
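The split ratio in the table is the average number of subword tokens produced per whitespace-separated word, so a lower value means fewer decoding steps per word and thus faster generation. Below is a minimal sketch of how such a ratio can be measured with Hugging Face `transformers`; the repo ids and the sample sentence are illustrative assumptions, not taken from this commit.

```python
from transformers import AutoTokenizer

def split_ratio(tokenizer, text: str) -> float:
    """Average number of subword tokens per whitespace-separated word."""
    words = text.split()
    token_ids = tokenizer(text, add_special_tokens=False)["input_ids"]
    return len(token_ids) / len(words)

# Hypothetical sample sentence (Bokmål); the README's ratios were presumably
# computed over a much larger corpus, so one sentence only illustrates the idea.
sample = "Dette er en kort setning på norsk bokmål."

# Repo ids are assumptions for illustration; check the model cards for the
# exact paths.
for repo_id in ("mistralai/Mistral-Nemo-Base-2407", "norallm/normistral-11b"):
    tok = AutoTokenizer.from_pretrained(repo_id)
    print(f"{repo_id}: {split_ratio(tok, sample):.2f}")
```

Run over a representative corpus, this kind of measurement should yield ratios of the same order as the table above (roughly 1.2 versus 1.8 tokens per Bokmål word), which is where the claimed inference speedup comes from.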