Sparse MPT-7B-Chat - DeepSparse

Chat-aligned MPT 7b model pruned to 50% and quantized using SparseGPT for inference with DeepSparse

from deepsparse import TextGeneration
model = TextGeneration(model="hf:neuralmagic/mpt-7b-chat-pruned50-quant")
model("Tell me a joke.", max_new_tokens=50)

Downloads last month: 20

Inference Providers NEW

Text Generation

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

RedHatAI
/

mpt-7b-chat-pruned50-quant-ds

Sparse MPT-7B-Chat - DeepSparse

Space using RedHatAI/mpt-7b-chat-pruned50-quant-ds 1