Update vocab size #42
by mathemakitten - opened
README.md CHANGED
@@ -191,7 +191,7 @@ The BLOOM tokenizer ([link](https://huggingface.co/bigscience/tokenizer)) is a l
 
 - A simple pre-tokenization rule, no normalization
 
-- A vocabulary size of 250,
+- A vocabulary size of 250,880
 
 It was trained on a subset of a preliminary version of the corpus using alpha-weighting per language.
 
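For anyone who wants to verify the corrected figure, here is a minimal sketch using the `transformers` library. It is not part of the PR; the `bigscience/bloom` repo id is assumed from context, and the snippet needs network access to the Hub.

```python
# Minimal check of the vocabulary size quoted in the model card.
from transformers import AutoConfig, AutoTokenizer

config = AutoConfig.from_pretrained("bigscience/bloom")
tokenizer = AutoTokenizer.from_pretrained("bigscience/bloom")

# The model config's vocab_size should match the 250,880 figure
# introduced by this change.
print(config.vocab_size)

# The tokenizer's own entry count can differ from the model's
# (padded) embedding size, so it is printed separately here.
print(len(tokenizer))
```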