Bloom CTranslate2 models

This is a collection of some of the BigScience Bloom models exported to the CTranslate2 model format. This allows these models to be loaded and used efficiently on CPU or GPU.

Models

The models have been converted to float16 and can be loaded with a different quantization type at load time (e.g. int8).

Model name       Description
bloom-560m       560M parameter model pretrained on ROOTS
bloom-3b         3B parameter model pretrained on ROOTS
bloomz-7b1       7.1B parameter model finetuned on xP3
bloomz-7b1-mt    7.1B parameter model finetuned on xP3mt
mt0-xxl-mt       13B parameter model finetuned on xP3

Each model is stored in its own directory of this repository; see the directories for the models available.
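Since the stored weights are float16 and the quantization type is chosen at load time, one possible way to pick it is to ask CTranslate2 which compute types the target device supports and take the first match from a preference list. The sketch below is an illustration under assumptions, not part of this repository: `pick_compute_type`, `load_generator`, and the preference order are hypothetical, while `ctranslate2.get_supported_compute_types` and the `device` and `compute_type` arguments of `ctranslate2.Generator` are the library calls it relies on.

```python
# Hypothetical preference order: smallest memory footprint first.
DEFAULT_PREFERENCES = ("int8_float16", "int8", "float16", "float32")


def pick_compute_type(supported, preferences=DEFAULT_PREFERENCES):
    """Return the first preferred compute type that the device supports."""
    for compute_type in preferences:
        if compute_type in supported:
            return compute_type
    raise ValueError(f"none of {preferences} is supported: {sorted(supported)}")


def load_generator(model_dir, device="cpu"):
    """Load a converted model with the best compute type available (hypothetical helper)."""
    import ctranslate2  # imported lazily so pick_compute_type works without it

    supported = ctranslate2.get_supported_compute_types(device)
    compute_type = pick_compute_type(supported)
    return ctranslate2.Generator(model_dir, device=device, compute_type=compute_type)


# Example with a made-up set of supported types (e.g. a CPU without float16 support).
print(pick_compute_type({"float32", "int8"}))
```

With a model directory downloaded as in the usage section, `load_generator(model_dir, device="cuda")` would be the GPU variant, assuming a CUDA-enabled build of CTranslate2.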

Simple usage example

Install dependencies:

pip install huggingface_hub ctranslate2 transformers torch

Usage:

import ctranslate2
import huggingface_hub
import transformers

model_name = "bloomz-7b1"
prompt = "Hello, I am Joan and I am from Barcelona and"

repo_id = "jordimas/bloom-ctranslate2"

# Download only the files whose path matches the chosen model name.
snapshot_folder = huggingface_hub.snapshot_download(repo_id=repo_id, allow_patterns=f"*{model_name}*")
print(f"folder: {snapshot_folder}")

# Each model lives in its own subdirectory of the snapshot.
model = f"{snapshot_folder}/{model_name}"
# Load the float16 weights and quantize them to int8 at load time.
generator = ctranslate2.Generator(model, compute_type="int8")
tokenizer = transformers.AutoTokenizer.from_pretrained(model)

# CTranslate2 expects token strings rather than token IDs as input.
start_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
results = generator.generate_batch([start_tokens], max_length=90)
# sequences_ids holds the generated token IDs of each hypothesis.
result = tokenizer.decode(results[0].sequences_ids[0])
print(f"Result: {result}")
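One detail of the download step is worth illustrating: the pattern `*bloomz-7b1*` also matches paths under `bloomz-7b1-mt`, so both models may be downloaded. The sketch below shows the difference with Python's `fnmatch`, assuming `allow_patterns` follows the same glob-style matching. The file listing is made up for the demonstration and `select` is a hypothetical helper.

```python
from fnmatch import fnmatch

# Hypothetical repository listing; the real file names may differ.
repo_files = [
    "bloomz-7b1/model.bin",
    "bloomz-7b1/config.json",
    "bloomz-7b1-mt/model.bin",
    "bloom-560m/model.bin",
]


def select(files, pattern):
    """Return the files whose path matches a glob-style pattern."""
    return [path for path in files if fnmatch(path, pattern)]


# Substring-style pattern: also picks up the "-mt" variant.
broad = select(repo_files, "*bloomz-7b1*")
# Directory-anchored pattern: only the requested model.
narrow = select(repo_files, "bloomz-7b1/*")

print("broad: ", broad)
print("narrow:", narrow)
```

Under that assumption, passing `allow_patterns=f"{model_name}/*"` to `snapshot_download` would restrict the download to a single model directory.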