---
library_name: transformers
base_model:
- HuggingFaceM4/Idefics3-8B-Llama3
pipeline_tag: image-text-to-text
---
# Idefics3-8B-Llama3-bnb_nf4
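BitsAndBytes NF4 (4-bit NormalFloat) quantization of [HuggingFaceM4/Idefics3-8B-Llama3](https://huggingface.co/HuggingFaceM4/Idefics3-8B-Llama3).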
### Quantization
Quantization created with:
``` python
import torch
from transformers import AutoModelForVision2Seq, BitsAndBytesConfig

model_id = "HuggingFaceM4/Idefics3-8B-Llama3"

# 4-bit NF4 weights with double quantization; matmuls are computed in bfloat16.
nf4_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    llm_int8_enable_fp32_cpu_offload=True,
    # Leave the LM head, vision encoder, and connector unquantized.
    llm_int8_skip_modules=["lm_head", "model.vision_model", "model.connector"],
)

model_nf4 = AutoModelForVision2Seq.from_pretrained(model_id, quantization_config=nf4_config)
```
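As a rough illustration of what `bnb_4bit_use_double_quant=True` buys in storage terms, the sketch below estimates the bits stored per quantized parameter. It assumes the block sizes described for QLoRA-style NF4 (64 weights per block, 8-bit second-level constants grouped in blocks of 256); the function names are hypothetical and not part of `transformers` or `bitsandbytes`. Modules listed in `llm_int8_skip_modules` stay in their original precision and are not covered by this estimate.

```python
def nf4_bits_per_param(double_quant: bool,
                       block_size: int = 64,
                       nested_block_size: int = 256) -> float:
    """Approximate bits stored per 4-bit-quantized weight (assumed block sizes)."""
    weight_bits = 4.0
    if double_quant:
        # One 8-bit scale per weight block, plus one fp32 constant
        # per nested block of scales.
        overhead = 8.0 / block_size + 32.0 / (block_size * nested_block_size)
    else:
        # One fp32 scale per weight block.
        overhead = 32.0 / block_size
    return weight_bits + overhead


def quantized_size_gib(num_params: int, double_quant: bool = True) -> float:
    """Approximate size in GiB of num_params quantized weights."""
    return num_params * nf4_bits_per_param(double_quant) / 8 / 2**30


if __name__ == "__main__":
    print(nf4_bits_per_param(double_quant=False))
    print(nf4_bits_per_param(double_quant=True))
```

Under these assumptions, double quantization trims the per-parameter scale overhead from 0.5 bits to roughly 0.127 bits.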