This model is based on bigscience/bloom-1b7.

We pruned its vocabulary from 250,880 to 46,145 tokens using a Chinese corpus to reduce GPU memory usage. As a result, the total parameter count is now 1.4B.
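To make the effect of the pruning concrete, here is a rough sketch of where the savings come from: dropping rows from the token embedding matrix. The vocabulary sizes are the ones stated above. The hidden size of 2048 is an assumption about bloom-1b7, and `embedding_savings` and `prune_embedding` are hypothetical helpers written for illustration only; they are not the procedure actually used to build this model.

```python
# Vocabulary sizes are taken from the model card.
OLD_VOCAB = 250880
NEW_VOCAB = 46145
HIDDEN_SIZE = 2048   # ASSUMPTION: hidden size of bloom-1b7
BYTES_PER_PARAM = 4  # 32-bit floats


def embedding_savings(old_vocab, new_vocab, hidden_size):
    """Number of parameters removed from one embedding matrix
    when it shrinks from old_vocab rows to new_vocab rows."""
    return (old_vocab - new_vocab) * hidden_size


def prune_embedding(embedding, kept_ids):
    """Keep only the rows listed in kept_ids.

    Returns the smaller matrix and a map from old token id to new token id,
    which a pruned tokenizer would need in order to stay consistent.
    """
    new_embedding = [embedding[i] for i in kept_ids]
    id_map = {old_id: new_id for new_id, old_id in enumerate(kept_ids)}
    return new_embedding, id_map


saved_params = embedding_savings(OLD_VOCAB, NEW_VOCAB, HIDDEN_SIZE)
saved_bytes = saved_params * BYTES_PER_PARAM
print(f"embedding parameters removed: {saved_params:,}")
print(f"memory saved at 4 bytes each: {saved_bytes / 2**30:.2f} GiB")

# Toy illustration: a 6-token vocabulary with 3-dimensional embeddings,
# of which only tokens 0, 2 and 5 are kept.
toy_embedding = [[float(row)] * 3 for row in range(6)]
pruned, id_map = prune_embedding(toy_embedding, [0, 2, 5])
print(pruned)
print(id_map)
```

Under the assumed hidden size, the pruned rows account for roughly 0.42B parameters. The toy example shows the other half of the job: every kept token needs a new, contiguous id so that the tokenizer and the smaller matrix still agree.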

How to use

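from transformers import BloomTokenizerFast, BloomForCausalLM

# Load the pruned tokenizer and the model weights from the Hugging Face Hub.
tokenizer = BloomTokenizerFast.from_pretrained('Langboat/bloom-1b4-zh')
model = BloomForCausalLM.from_pretrained('Langboat/bloom-1b4-zh')

# Encode a Chinese prompt ("The capital of China is"), generate a continuation, and decode it.
input_ids = tokenizer.encode('中国的首都是', return_tensors='pt')
print(tokenizer.batch_decode(model.generate(input_ids)))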

Model size: 1.4B params (Safetensors). Tensor type: F32.