
Why can't the model be run (really slowly) on consumer hardware?

#61
by TonoTheHero - opened

Hi! I'm curious as to why this isn't possible.
Couldn't you just keep the weights on an SSD and split the work 400/8 on a consumer GPU?

If you are not concerned about inference times, you don't even need GPUs; it runs fine on CPUs, provided you have enough system RAM. There has been much discussion about performance on different hardware configurations. See:

https://huggingface.co/bigscience/bloom/discussions/45
https://huggingface.co/bigscience/bloom/discussions/59
https://huggingface.co/bigscience/bloom/discussions/58

In case it helps, I wrote a blog post that shows how to run BLOOM (the largest, 176B-parameter version) on a desktop computer, even if you don't have a GPU. On my computer (11th-gen i5, 16 GB RAM, 1 TB Samsung 980 Pro SSD), generation takes 3 minutes per token using only the CPU, which is a little slow but manageable. See the blog post linked below.

https://towardsdatascience.com/run-bloom-the-largest-open-access-ai-model-on-your-desktop-computer-f48e1e2a9a32
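The approach in the post loads the model one shard at a time: each block's weights are read from disk, applied, and freed before the next. The `load_state_dict` call quoted later in this thread relies on filtering a shard's state dict by key prefix so that a single submodule accepts it. A minimal, stdlib-only sketch of that filtering step (the function name and key layout are assumptions for illustration, not the exact blog code):

```python
def strip_prefix(state_dict, prefix):
    """Keep only entries whose key starts with `prefix`, and drop the
    prefix so a submodule's load_state_dict() accepts the keys.

    E.g. a full-model shard stores the final layer norm's weights under
    "ln_f.weight" / "ln_f.bias", but final_lnorm.load_state_dict()
    expects plain "weight" / "bias".
    """
    return {key[len(prefix):]: value
            for key, value in state_dict.items()
            if key.startswith(prefix)}

# Hypothetical shard contents, mimicking BLOOM's key layout:
shard = {"ln_f.weight": "w", "ln_f.bias": "b", "word_embeddings.weight": "e"}
print(strip_prefix(shard, "ln_f."))  # {'weight': 'w', 'bias': 'b'}
```

The point of this pattern is that only one block's tensors are ever resident in RAM at a time, which is what makes 176B parameters tractable on a 16 GB machine.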

BigScience Workshop org

really cool article @arteagac thanks for sharing! <3


Hello!
I followed the guide, but the downloaded tokenizer.json file is invalid; in fact, it is not even a JSON file. I found one somewhere on the internet, but now when I run
final_lnorm.load_state_dict(get_state_dict(shard_num=72, prefix="ln_f."))
it says
File "/home/usuari/anaconda3/lib/python3.9/site-packages/torch/serialization.py", line 920, in _legacy_load
magic_number = pickle_module.load(f, **pickle_load_args)
_pickle.UnpicklingError: invalid load key, 'v'.

Do you know how to solve it?
Thanks!

Hi @cdani, I suspect the files were not downloaded properly and some of them might be Git LFS pointers instead of the actual files. The easiest way to fix this is to download the entire repo from scratch using Git LFS, as follows:

git lfs install
git clone https://huggingface.co/bigscience/bloom

This will download the entire repo (including some repo history). When the download is complete, make sure the size of each file matches the size shown on the repository page.
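For what it's worth, the traceback above is consistent with this diagnosis: a Git LFS pointer is a small text file whose first line is `version https://git-lfs.github.com/spec/v1`, so the first byte pickle sees is `v`, which is exactly what produces `invalid load key, 'v'.`. A quick stdlib-only check for pointer stubs in a local clone (the directory and glob pattern are just examples):

```python
from pathlib import Path

# A Git LFS pointer file begins with this exact header line; its first
# byte is 'v', which is what triggers "invalid load key, 'v'." when the
# file is mistakenly fed to pickle/torch.load.
LFS_HEADER = b"version https://git-lfs.github.com/spec/v1"

def is_lfs_pointer(path):
    """True if `path` looks like an un-downloaded Git LFS pointer stub."""
    with open(path, "rb") as f:
        return f.read(len(LFS_HEADER)) == LFS_HEADER

# Example: flag any pointer stubs left in a local clone of the repo.
for p in Path("bloom").glob("*.bin"):
    if is_lfs_pointer(p):
        print(f"{p} is still an LFS pointer; run `git lfs pull` to fetch it")
```

Running `git lfs pull` inside the clone replaces any remaining pointer stubs with the real binaries.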

BigScience Workshop org

Closing due to lack of activity. Feel free to re-open if you feel that the discussion is not finished yet.

TimeRobber changed discussion status to closed
