michaelfeil committed df52dfa (parent: 6b344ea)

Upload Salesforce/codegen-350M-multi ctranslate fp16 weights
README.md
CHANGED
````diff
@@ -11,7 +11,7 @@ Speedup inference while reducing memory by 2x-4x using int8 inference in C++ on
 
 quantized version of [Salesforce/codegen-350M-multi](https://huggingface.co/Salesforce/codegen-350M-multi)
 ```bash
-pip install hf-hub-ctranslate2>=2.0.
+pip install hf-hub-ctranslate2>=2.0.8
 ```
 Converted on 2023-05-21 using
 ```
@@ -33,10 +33,11 @@ model = GeneratorCT2fromHfHub(
     model_name_or_path=model_name,
     device="cuda",
     compute_type="int8_float16",
-    tokenizer=AutoTokenizer.from_pretrained("Salesforce/codegen-350M-multi")
+    # tokenizer=AutoTokenizer.from_pretrained("Salesforce/codegen-350M-multi")
 )
 outputs = model.generate(
     text=["def print_hello_world():", "def hello_name(name:"],
+    max_length=64
 )
 print(outputs)
 ```
````
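The first hunk completes a truncated pip version specifier: `>=2.0.` is not a valid PEP 440 specifier, while `>=2.0.8` is. A minimal sketch of the difference, assuming the third-party `packaging` library (the parser pip itself relies on) is installed; `is_valid_specifier` is a hypothetical helper, not part of the commit:

```python
# Check whether a string parses as a PEP 440 version specifier,
# using the `packaging` library (assumed installed; pip depends on it).
from packaging.specifiers import InvalidSpecifier, SpecifierSet


def is_valid_specifier(spec: str) -> bool:
    """Return True if `spec` parses as a PEP 440 version specifier."""
    try:
        SpecifierSet(spec)
        return True
    except InvalidSpecifier:
        return False


print(is_valid_specifier(">=2.0."))   # pre-fix, truncated specifier -> False
print(is_valid_specifier(">=2.0.8"))  # fixed specifier -> True
print("2.0.8" in SpecifierSet(">=2.0.8"))  # the pinned version satisfies it -> True
```

So before this commit the README's install line would have failed to parse as a requirement; the fix pins a concrete minimum version.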