Update README.md
> alternative section title: how to get this monster to run inference on free colab runtimes

Via [this PR](https://github.com/huggingface/transformers/pull/20341), LLM.int8 is now supported for `long-t5` models.

- per **initial tests**, the summarization quality seems to hold while using _significantly_ less memory! \*
- a version of this model quantized to int8 is [already on the hub here](https://huggingface.co/pszemraj/long-t5-tglobal-xl-16384-book-summary-8bit), so if you're going to use the 8-bit version anyway, you can start there for a download of only ~3.5 GB! (see the loading sketch after the snippet below)

First, make sure you have the latest versions of the relevant packages:
```bash
pip install -U transformers bitsandbytes accelerate
```
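Since LLM.int8 runs its quantized matmuls on a CUDA GPU, it can be worth confirming the (free) Colab runtime actually has one before downloading anything. This quick check is an optional addition, not from the original card:

```python
import torch

# bitsandbytes' int8 kernels need a CUDA device (e.g. the free Colab T4)
assert torch.cuda.is_available(), "no CUDA GPU found; switch to a GPU runtime"
print(torch.cuda.get_device_name(0))
```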
load in 8-bit (_magic completed by `bitsandbytes` behind the scenes_):
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# NOTE: the middle of this snippet was truncated in the source; the repo id
# and keyword arguments below are reconstructed from the surrounding context
model_name = "pszemraj/long-t5-tglobal-xl-16384-book-summary"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_name,
    load_in_8bit=True,   # int8 quantization via bitsandbytes
    device_map="auto",   # requires `accelerate`
)
```
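If you'd rather start from the pre-quantized checkpoint linked in the list above, loading should look essentially the same. This is a sketch under that assumption (the kwargs mirror the snippet above and are not taken verbatim from that repo's card):

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# the ~3.5 GB pre-quantized checkpoint mentioned earlier
model_name_8bit = "pszemraj/long-t5-tglobal-xl-16384-book-summary-8bit"
tokenizer = AutoTokenizer.from_pretrained(model_name_8bit)
model = AutoModelForSeq2SeqLM.from_pretrained(
    model_name_8bit,
    load_in_8bit=True,   # assumed: same int8 flags as for the base model
    device_map="auto",
)
```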
The 8-bit loading code above is already present in the Colab demo linked at the top of the model card.
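For completeness, here is a minimal end-to-end generation sketch; the input text, token limits, and beam-search settings are illustrative placeholders rather than the card's recommended values:

```python
long_document = "..."  # paste a long article or book chapter here

# this checkpoint accepts inputs up to 16384 tokens
inputs = tokenizer(
    long_document,
    return_tensors="pt",
    truncation=True,
    max_length=16384,
).to(model.device)

summary_ids = model.generate(
    **inputs,
    max_length=512,           # illustrative cap on summary length
    num_beams=4,              # beam search, as discussed earlier in the card
    no_repeat_ngram_size=3,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```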
\* More rigorous metrics-based research comparing beam-search summarization with and without LLM.int8 will take place over time.
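In the meantime, if you want to eyeball the memory side of that claim yourself, loaded `transformers` models expose a footprint helper:

```python
# approximate size of the loaded int8 weights (get_memory_footprint returns bytes)
print(f"{model.get_memory_footprint() / 1e9:.2f} GB")
```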