Quantized Versions of jais-13b-chat
Hello,
I'm using the "jais-13b-chat" model and find it very useful. To reduce memory requirements, could you consider providing 4-bit and 8-bit quantized versions? This would greatly help deployments in resource-constrained environments.
Thanks for considering,
Noureddine
You can use bitsandbytes directly on jais.
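For reference, here is a minimal sketch of on-the-fly quantization through transformers with bitsandbytes. The hub id is an assumption (the jais repo has been published under different organizations), and you need the bitsandbytes and accelerate packages installed:

```python
# Minimal sketch: on-the-fly 8-bit quantization of jais via bitsandbytes.
# "inceptionai/jais-13b-chat" is an assumed hub id; adjust to the repo you use.
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "inceptionai/jais-13b-chat"

# 8-bit config; for 4-bit, use load_in_4bit=True instead (optionally with
# bnb_4bit_compute_dtype=torch.float16 and bnb_4bit_quant_type="nf4").
bnb_config = BitsAndBytesConfig(load_in_8bit=True)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True,  # jais ships custom modeling code
)
```

This quantizes the full-precision weights at load time, so you still download the fp16/fp32 checkpoint; a pre-quantized repo only saves download size and disk space.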
There is this quantized version (https://huggingface.co/mouaff25/jais-13b-chat-8bit), but it did not work for me: the model loaded, but I got a tensor mismatch error.
It works on an A100:
https://colab.research.google.com/drive/1QLihIVHOnWrz5P7XER4mn13YuGAbnPDq?usp=sharing
I've just pushed an 8-bit quantized version; feel free to check it out: 'drakkola/jais-13b-chat-8bit'.
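A pre-quantized bitsandbytes checkpoint like this can be loaded directly, without a quantization config. A minimal sketch, using the repo id from the post above; the prompt is simplified for illustration (jais-13b-chat expects its own prompt template):

```python
# Minimal sketch: loading a pre-quantized 8-bit jais checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "drakkola/jais-13b-chat-8bit"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    trust_remote_code=True,  # jais uses custom modeling code
)

# Simplified prompt; in practice, format it per the jais-13b-chat model card.
prompt = "What is the capital of the UAE?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```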