---
license: apache-2.0
---

# Model Card for Mistral-7B-Instruct-v0.1-8bit

Mistral-7B-Instruct-v0.1-8bit is an 8-bit quantized version of [Mistral-7B-Instruct-v0.1](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.1): I simply loaded the original model in 8-bit (with `torch_dtype=torch.float16`) and pushed the result here.

For full details of this model, please read the original [paper](https://arxiv.org/abs/2310.06825) and [release blog post](https://mistral.ai/news/la-plateforme/).

The quantized checkpoint was created and pushed to the Hub as follows:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "mistralai/Mistral-7B-Instruct-v0.1"

tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the original weights in 8-bit (via bitsandbytes) with FlashAttention-2 enabled
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    load_in_8bit=True,
    use_flash_attention_2=True,
    torch_dtype=torch.float16,
)

# Push the quantized checkpoint to the Hub
model.push_to_hub("LsTam/Mistral-7B-Instruct-v0.1-8bit")
```
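
Note that on recent `transformers` releases the `load_in_8bit` argument is deprecated in favor of an explicit `BitsAndBytesConfig`. A minimal sketch of the equivalent load (assuming a recent `transformers` and `bitsandbytes` plus `flash-attn` installed; not how the checkpoint above was originally produced) would be:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Equivalent 8-bit load on newer transformers versions: pass an explicit
# quantization config instead of the deprecated load_in_8bit flag.
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.1",
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    attn_implementation="flash_attention_2",
    torch_dtype=torch.float16,
)
```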

To use it:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tok_name = "mistralai/Mistral-7B-Instruct-v0.1"
model_name = "LsTam/Mistral-7B-Instruct-v0.1-8bit"

# The tokenizer is unchanged, so load it from the original repository
tokenizer = AutoTokenizer.from_pretrained(tok_name)

# Load the pre-quantized 8-bit weights directly from this repository
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    use_flash_attention_2=True,
)
```
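
For reference, a minimal generation sketch using the instruct chat template; it assumes the `model` and `tokenizer` loaded above, a `transformers` version with `apply_chat_template`, and a GPU, and the prompt is only an illustration:

```python
# Build a single-turn conversation in the Mistral instruct format
messages = [{"role": "user", "content": "Explain 8-bit quantization in one paragraph."}]

# apply_chat_template wraps the message in [INST] ... [/INST] tokens
inputs = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```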