Model Card for alokabhishek/Mistral-7B-Instruct-v0.2-GGUF
This repo contains a GGUF quantized version of Mistral AI_'s Mistral-7B-Instruct-v0.2 model, quantized using llama.cpp.
Model Details
- Model creator: Mistral AI_
- Original model: mistralai/Mistral-7B-Instruct-v0.2
About GGUF quantization using llama.cpp
- llama.cpp GitHub repo: https://github.com/ggerganov/llama.cpp
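For context, here is a minimal sketch of how a GGUF quant like the ones in this repo is typically produced with llama.cpp's conversion and quantization tools. Tool names are from recent llama.cpp releases (older releases shipped convert.py and quantize), and the local paths below are illustrative:
# Convert the original Hugging Face checkpoint to a GGUF file in f16
! python convert_hf_to_gguf.py ./Mistral-7B-Instruct-v0.2 --outtype f16 --outfile mistral-7b-instruct-v0.2.f16.gguf
# Quantize the f16 GGUF down to Q4_K_M
! ./llama-quantize mistral-7b-instruct-v0.2.f16.gguf mistral-7b-instruct-v0.2.Q4_K_M.gguf Q4_K_M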
How to Get Started with the Model
Use the code below to get started with the model.
How to run from Python code
First install the package
# Base ctransformers with CUDA GPU acceleration (quote the requirement so the shell
# does not treat the brackets and ">=" specially)
! pip install "ctransformers[cuda]>=0.2.24"
# Or with no GPU acceleration
# ! pip install "ctransformers>=0.2.24"
! pip install transformers huggingface_hub torch
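Optionally, pre-download the GGUF file with huggingface_hub. A small sketch; pick whichever quant filename from this repo you want to run:
from huggingface_hub import hf_hub_download

# Download one quant file from this repo into the local Hugging Face cache
gguf_path = hf_hub_download(
    repo_id="alokabhishek/Mistral-7B-Instruct-v0.2-GGUF",
    filename="mistral-7b-instruct-v0.2.Q4_K_M.gguf",
)
print(gguf_path)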
Import
from ctransformers import AutoModelForCausalLM
from transformers import AutoTokenizer, pipeline
Use a pipeline as a high-level helper
# Load the GGUF model via ctransformers; hf=True returns a transformers-compatible model
model_mistral = AutoModelForCausalLM.from_pretrained(
    "alokabhishek/Mistral-7B-Instruct-v0.2-GGUF",
    model_file="mistral-7b-instruct-v0.2.Q4_K_M.gguf",  # swap in another quant, e.g. Q5_K_M, as needed
    model_type="mistral",
    gpu_layers=50,  # number of layers to offload to the GPU; set to 0 for CPU-only
    hf=True,
)

# Load the tokenizer (if this repo lacks tokenizer files, load it from the
# original mistralai/Mistral-7B-Instruct-v0.2 repo instead)
tokenizer_mistral = AutoTokenizer.from_pretrained(
    "alokabhishek/Mistral-7B-Instruct-v0.2-GGUF", use_fast=True
)

# Create a text-generation pipeline
pipe_mistral = pipeline(task="text-generation", model=model_mistral, tokenizer=tokenizer_mistral)
prompt_mistral = "Tell me a funny joke about Large Language Models meeting a Blackhole in an intergalactic Bar."
output_mistral = pipe_mistral(prompt_mistral, max_new_tokens=512)
print(output_mistral[0]["generated_text"])
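Mistral-7B-Instruct-v0.2 expects the [INST] ... [/INST] instruction format, so for best results wrap the prompt with the tokenizer's chat template before generating. A sketch; if the tokenizer in this repo does not ship a chat template, load the tokenizer from mistralai/Mistral-7B-Instruct-v0.2 instead:
messages = [{"role": "user", "content": prompt_mistral}]
# Render the message into Mistral's [INST] ... [/INST] format
prompt_templated = tokenizer_mistral.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
output_templated = pipe_mistral(prompt_templated, max_new_tokens=512)
print(output_templated[0]["generated_text"])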