This is a 4-bit quantized version of the Jais-13b model, produced with bitsandbytes (NF4).

If you are using text-generation-webui, select the Transformers model loader and set:

  • Compute dtype: bfloat16
  • Quantization type: nf4
  • Load in 4-bit: True
  • Use double quantization: True
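
For readers who load models from Python rather than through the web UI, the four settings above can be sketched as keyword arguments to the `BitsAndBytesConfig` class in Transformers. The names `QUANT_SETTINGS` and `build_quant_config` are hypothetical helpers introduced only for illustration, and the mapping itself is an assumption about how the settings correspond, not something the model card states.

```python
# Keyword arguments mirroring the four loader settings listed above.
# Assumption: each web UI setting maps onto the BitsAndBytesConfig
# parameter of the corresponding name.
QUANT_SETTINGS = {
    "load_in_4bit": True,                  # Load in 4-bit: True
    "bnb_4bit_quant_type": "nf4",          # Quantization type: nf4
    "bnb_4bit_compute_dtype": "bfloat16",  # Compute dtype: bfloat16
    "bnb_4bit_use_double_quant": True,     # Use double quantization: True
}


def build_quant_config():
    """Return a BitsAndBytesConfig built from QUANT_SETTINGS."""
    # Imported lazily so the settings can be inspected on a machine
    # that does not have transformers installed.
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**QUANT_SETTINGS)
```

A config built this way would normally be passed as `quantization_config=build_quant_config()` to `AutoModelForCausalLM.from_pretrained` when quantizing unquantized weights at load time. A repository that is already stored in 4-bit form typically carries its quantization settings in its own configuration, so this step is likely unnecessary for this model.
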
# Example: load the quantized model with Transformers and time one generation.
import warnings
from datetime import datetime

from transformers import AutoModelForCausalLM, AutoTokenizer

warnings.filterwarnings("ignore")

model_name = "jwnder/core42_jais-13b-bnb-4bit"

# trust_remote_code is required because the repository ships custom model code.
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_name, trust_remote_code=True)

# Put the inputs on the same device as the model before generating.
inputs = tokenizer("Testing LLM!", return_tensors="pt").to(model.device)
start = datetime.now()
outputs = model.generate(**inputs)
end = datetime.now()
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
print(f"Generation took {end - start}")
Model size: 6.93B params (Safetensors). Tensor types: F32, FP16, U8.