Add header for loading the model, plus a quick start for inference on RDU
README.md
<details>
<summary>Click to expand</summary>

### Loading the model with Huggingface

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("sambanovasystems/BLOOMChat-176B-v1")
model = AutoModelForCausalLM.from_pretrained("sambanovasystems/BLOOMChat-176B-v1", device_map="auto", torch_dtype="auto")
```
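Once the model and tokenizer are loaded, a single chat turn can be run with `model.generate`. The sketch below is a minimal helper, assuming the `<human>:`/`<bot>:` chat markup used by BLOOMChat-style prompts; the sampling settings are illustrative rather than tuned recommendations, and `build_prompt`/`generate_reply` are hypothetical helper names, not part of the model card.

```python
# Minimal chat-turn sketch for an already-loaded causal LM.
# Assumption: BLOOMChat-style "<human>:"/"<bot>:" prompt markup.

def build_prompt(user_message: str) -> str:
    """Wrap a user message in the assumed chat markup."""
    return f"<human>: {user_message}\n<bot>:"

def generate_reply(model, tokenizer, user_message: str, max_new_tokens: int = 256) -> str:
    """Run one chat turn through a loaded model/tokenizer pair."""
    inputs = tokenizer(build_prompt(user_message), return_tensors="pt")
    # Move inputs onto the same device as the (possibly sharded) model.
    inputs = {k: v.to(model.device) for k, v in inputs.items()}
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,      # illustrative sampling settings
        top_p=0.9,
        temperature=0.8,
    )
    # Keep only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[-1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

A call such as `generate_reply(model, tokenizer, "Give me three travel tips.")` would then return one assistant reply; for multi-turn chat, `build_prompt` can be extended to concatenate prior turns in the same markup.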
### Quick Start Inference on SambaNova's in-house Reconfigurable Dataflow Unit (RDU)
The inference code to run the model can be found in our [github repo](https://github.com/sambanova/bloomchat/blob/main/rdu_quick_start/inference.py). This code requires the [SambaFlow](https://docs.sambanova.ai/developer/latest/sambaflow-intro.html) SDK to execute. If you are interested in running models on RDUs, [please feel free to get in touch](https://sambanova.ai/getstarted).
### Quick Start Inference on GPU
[This tutorial](https://github.com/huggingface/transformers-bloom-inference) from Huggingface serves as the base layer for running our model. The tutorial is written for BLOOM; however, since our model is based on BLOOM, we can repurpose it.