davda54 commited on
Commit
90f7ec2
1 Parent(s): a56f3c7

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +12 -9
README.md CHANGED
@@ -319,10 +319,10 @@ generate("I'm super excited about this Norwegian NORA model! Can it translate th
319
  ```
320
 
321
  _____
322
- ## Example usage with low GPU usage
323
  Install bitsandbytes if you want to load in 8bit
324
 
325
- ```python
326
  pip install bitsandbytes
327
  pip install accelerate
328
  ```
@@ -334,13 +334,17 @@ import torch
334
 
335
  # First, we will have to import the tokenizer and the language model
336
  tokenizer = AutoTokenizer.from_pretrained("norallm/normistral-7b-warm")
337
- model = AutoModelForCausalLM.from_pretrained("norallm/normistral-7b-warm",
338
- device_map='auto',
339
- load_in_8bit=True,
340
- torch_dtype=torch.float16)
341
  # This setup needs about 8 GB of VRAM
342
- # Setting load_in_8bit = False, 15 GB of VRAM
343
- # Using torch.float32 and load_in_8bit = False, 21 GB of VRAM
 
 
 
 
 
 
 
344
 
345
 
346
  # Now we will define the zero-shot prompt template
@@ -362,5 +366,4 @@ def generate(text):
362
 
363
  # Now you can simply call the generate function with an English text you want to translate:
364
  generate("I'm super excited about this Norwegian NORA model! Can it translate these sentences?")
365
- # > this should output: 'Jeg er super spent på denne norske NORA modellen! Kan den oversette disse setningene?'
366
  ```
 
319
  ```
320
 
321
  _____
322
+ ## Example usage on a GPU with ~16GB VRAM
323
  Install bitsandbytes if you want to load in 8bit
324
 
325
+ ```bash
326
  pip install bitsandbytes
327
  pip install accelerate
328
  ```
 
334
 
335
  # First, we will have to import the tokenizer and the language model
336
  tokenizer = AutoTokenizer.from_pretrained("norallm/normistral-7b-warm")
337
+
 
 
 
338
  # This setup needs about 8 GB of VRAM
339
+ # Setting `load_in_8bit=False` -> 15 GB of VRAM
340
+ # Using `torch.float32` and `load_in_8bit=False` -> 21 GB of VRAM
341
+ model = AutoModelForCausalLM.from_pretrained(
342
+ "norallm/normistral-7b-warm",
343
+ device_map='auto',
344
+ load_in_8bit=True,
345
+ torch_dtype=torch.bfloat16
346
+ )
347
+
348
 
349
 
350
  # Now we will define the zero-shot prompt template
 
366
 
367
  # Now you can simply call the generate function with an English text you want to translate:
368
  generate("I'm super excited about this Norwegian NORA model! Can it translate these sentences?")
 
369
  ```