remove "run on CPU" from documentation
#35
by eitanturok - opened
No description provided.
(I agree with this FWIW)
The documentation should state the hardware requirements to run the model: https://huggingface.co/databricks/dbrx-instruct/discussions/28
Currently the example provides the following text:
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|████████████████████████████████████| 61/61 [00:04<00:00, 12.35it/s]
Setting `pad_token_id` to `eos_token_id`:100257 for open-end generation.
and then hangs; there does not appear to be any process activity or memory caching going on.
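For what it's worth, a hang like this is consistent with the full set of weights being materialized in host RAM. A minimal loading sketch that avoids running on CPU, assuming a machine with enough aggregate GPU memory (the model id and `trust_remote_code` flag are from the model card; `device_map="auto"` is an assumption about the setup):

```python
# Hypothetical setup: a multi-GPU host with roughly 264 GB of aggregate GPU memory.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained(
    "databricks/dbrx-instruct", trust_remote_code=True
)
model = AutoModelForCausalLM.from_pretrained(
    "databricks/dbrx-instruct",
    torch_dtype=torch.bfloat16,  # ~2 bytes per parameter
    device_map="auto",           # shard the weights across available GPUs
    trust_remote_code=True,
)
```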
If you change `outputs = model.generate(**input_ids, max_new_tokens=200)`
to `outputs = model.generate(**input_ids, max_new_tokens=200, verbose=True)`
Then:
Special tokens have been added in the vocabulary, make sure the associated word embeddings are fine-tuned or trained.
Loading checkpoint shards: 100%|████████████████████████████████████| 61/61 [00:06<00:00, 9.90it/s]
Traceback (most recent call last):
  File "/Volumes/python/llm_dbrx-instruct.py", line 113, in <module>
    main(isInstruct=True)
  File "/Volumes/python/llm_dbrx-instruct.py", line 88, in main
    outputs = model.generate(**input_ids, max_new_tokens=200, verbose=True)
  File "/Users/user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/Users/user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/transformers/generation/utils.py", line 1325, in generate
    self._validate_model_kwargs(model_kwargs.copy())
  File "/Users/user/.pyenv/versions/3.9.6/lib/python3.9/site-packages/transformers/generation/utils.py", line 1121, in _validate_model_kwargs
    raise ValueError(
ValueError: The following `model_kwargs` are not used by the model: ['verbose'] (note: typos in the generate arguments will also show up in this list)
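That ValueError is expected: `generate` validates its keyword arguments, and `verbose` is not one of them. If the goal was to watch output appear while generation runs (and confirm the process is not actually hung), `transformers` provides a `TextStreamer` that can be passed to `generate`. A minimal sketch, assuming the `tokenizer`, `model`, and `input_ids` from the example above:

```python
from transformers import TextStreamer

# Prints each decoded chunk to stdout as it is generated, so a slow CPU run
# is visibly making progress instead of appearing to hang.
streamer = TextStreamer(tokenizer, skip_prompt=True)
outputs = model.generate(**input_ids, max_new_tokens=200, streamer=streamer)
```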
I am learning, and it was not immediately apparent to me that 264 GB of RAM was required to run this model.
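That figure lines up with a back-of-the-envelope estimate: DBRX has 132B total parameters, so at 16-bit precision the weights alone take roughly 132e9 × 2 bytes ≈ 264 GB, before any activations or KV cache:

```python
# Weight-memory estimate for DBRX (132B total parameters, MoE).
# Assumes 16-bit (bf16/fp16) weights; activations and KV cache are extra.
total_params = 132e9
bytes_per_param = 2
print(f"~{total_params * bytes_per_param / 1e9:.0f} GB")  # ~264 GB
```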
hanlintang changed pull request status to merged