davda54 committed
Commit a695f07 · verified · 1 Parent(s): 3f7f0b9

Update README.md

Files changed (1): README.md (+4 −0)
README.md CHANGED
@@ -23,6 +23,10 @@ continuously pretrained on a total of 260 billion subword tokens -- using a mix
 *Disclaimer: This model is pretrained on raw (mostly web-based) textual data. It is not finetuned to follow instructions, and it can generate harmful completions after inappropriate user prompts. It is primarily intended for research purposes.*
 
 
+## License
+
+*Here, we should probably discuss our understanding of the license*
+
 ## Tokenizer
 
 This model uses a new tokenizer, specially trained on the target languages. Therefore it offers substantially faster inference than the original Mistral-Nemo-Base-2407 model. Here are the subword-to-word split ratios across different languages:
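
The README paragraph in the diff above mentions subword-to-word split ratios as the measure of tokenizer efficiency. As a quick illustration of how such a ratio can be computed, here is a minimal sketch; `toy_tokenize` is a hypothetical stand-in for a real tokenizer's encode step (in practice one would use the model's own tokenizer, e.g. loaded via Hugging Face `AutoTokenizer`), not the actual tokenizer this commit describes.

```python
def split_ratio(text, tokenize):
    """Subword-to-word split ratio: tokens produced per whitespace word.

    Lower values mean fewer subwords per word, i.e. shorter input
    sequences and therefore faster inference.
    """
    words = text.split()
    tokens = tokenize(text)
    return len(tokens) / len(words)


def toy_tokenize(text):
    # Hypothetical stand-in tokenizer: splits every word into chunks
    # of at most 4 characters. A real subword tokenizer (BPE, unigram)
    # would be used here instead.
    tokens = []
    for word in text.split():
        tokens.extend(word[i:i + 4] for i in range(0, len(word), 4))
    return tokens


ratio = split_ratio("tokenization efficiency matters", toy_tokenize)
# 8 toy subword tokens over 3 words -> ratio of 8/3
```

Comparing this ratio for the new tokenizer against the original Mistral-Nemo-Base-2407 tokenizer, per language, would reproduce the kind of table the README paragraph refers to.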