add 8bit deets
README.md
# long-t5-tglobal-xl + BookSum
<a href="https://colab.research.google.com/gist/pszemraj/c19e32baf876deb866c31cd46c86e893/long-t5-xl-accelerate-test.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Summarize long text and get a SparkNotes-esque summary of arbitrary topics!

- Generalizes reasonably well to academic & narrative text.
- This is the XL checkpoint, which **from a human-evaluation perspective, [produces even better summaries](https://long-t5-xl-book-summary-examples.netlify.app/)**.

Read the paper by Guo et al. here: [LongT5: Efficient Text-To-Text Transformer for Long Sequences](https://arxiv.org/abs/2112.07916)
## How-To in Python
Install/update `transformers`: `pip install -U transformers`

Summarize text with `pipeline`. Below is a minimal sketch of the usage; the placeholder input and the `device` handling are illustrative:

```python
import torch
from transformers import pipeline

# load the checkpoint into a summarization pipeline,
# running on GPU when one is available
summarizer = pipeline(
    "summarization",
    model="pszemraj/long-t5-tglobal-xl-16384-book-summary",
    device=0 if torch.cuda.is_available() else -1,
)

long_text = "Here is a lot of text I don't want to read. Replace me."

result = summarizer(long_text)
print(result[0]["summary_text"])
```

Pass [other parameters related to beam search textgen](https://huggingface.co/blog/how-to-generate) when calling the pipeline.
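For instance, extra keyword arguments in the pipeline call are forwarded to `model.generate()`. A minimal sketch reusing `summarizer` and `long_text` from the snippet above (the parameter values are illustrative, not tuned settings):

```python
# extra keyword arguments are forwarded to model.generate();
# these values are illustrative, not tuned recommendations
result = summarizer(
    long_text,
    num_beams=4,
    early_stopping=True,
    no_repeat_ngram_size=3,
)
print(result[0]["summary_text"])
```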
### LLM.int8 Quantization
> Alternate section title: how to get this monster to run inference on free Colab runtimes

Per [this PR](https://github.com/huggingface/transformers/pull/20341), LLM.int8 is now supported for `long-t5` models. Per **initial testing**, summarization quality appears to hold while requiring _significantly_ less memory! \*

How-to: essentially, ensure `transformers` is pip-installed from the **latest GitHub repo `main`**, along with `bitsandbytes`.

Install the latest `main` branch:
```bash
pip install bitsandbytes
pip install git+https://github.com/huggingface/transformers.git
```
Load in 8-bit (_voodoo magic, the good kind, handled by `bitsandbytes` behind the scenes_):
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
tokenizer = AutoTokenizer.from_pretrained(
    "pszemraj/long-t5-tglobal-xl-16384-book-summary"
)
model = AutoModelForSeq2SeqLM.from_pretrained(
    "pszemraj/long-t5-tglobal-xl-16384-book-summary",
    load_in_8bit=True,
    device_map="auto",
)
```
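Inference then uses the same API as the full-precision model. A minimal sketch reusing the `tokenizer` and `model` loaded above (the generation settings are illustrative, not tuned values):

```python
# summarize a long document with the 8-bit model
long_text = "Replace me with the text you want summarized."

# tokenize and send inputs to the model's (first) device
inputs = tokenizer(long_text, return_tensors="pt").to(model.device)
summary_ids = model.generate(**inputs, max_new_tokens=512, num_beams=4)
print(tokenizer.batch_decode(summary_ids, skip_special_tokens=True)[0])
```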