Original model: https://huggingface.co/Qwen/Qwen2.5-7B-Instruct

Tested on a Snapdragon X Elite with the LM Studio 0.3.2 ARM64 Technology Preview: https://lmstudio.ai/snapdragon

Average answer speed: 17 tok/s

LM Studio Settings:

Before System: <|im_start|>system\n
After System: <|im_end|>\n
Before User: <|im_start|>user\n
After User: <|im_end|>\n
Before Assistant: <|im_start|>assistant\n
After Assistant: <|im_end|>\n
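
The prefix/suffix settings above follow the ChatML layout. As a minimal sketch of how they combine into a single prompt string, the snippet below wraps each message in its role's "Before"/"After" strings and then opens an assistant turn for the model to complete. The names `TEMPLATE` and `build_prompt` are hypothetical helpers for illustration, not part of LM Studio, and the exact string LM Studio assembles internally may differ.

```python
# Prefix/suffix pairs copied from the LM Studio settings listed above.
TEMPLATE = {
    "system": ("<|im_start|>system\n", "<|im_end|>\n"),
    "user": ("<|im_start|>user\n", "<|im_end|>\n"),
    "assistant": ("<|im_start|>assistant\n", "<|im_end|>\n"),
}


def build_prompt(messages):
    """Join (role, content) pairs into one ChatML-style prompt string.

    Hypothetical helper: it only illustrates how the settings fit together.
    """
    parts = []
    for role, content in messages:
        before, after = TEMPLATE[role]
        parts.append(before + content + after)
    # Leave an assistant turn open so the model generates the reply.
    parts.append(TEMPLATE["assistant"][0])
    return "".join(parts)


prompt = build_prompt([
    ("system", "You are a helpful assistant."),
    ("user", "Hello!"),
])
print(prompt)
```

Generation would normally stop when the model emits `<|im_end|>`, which is why that token is the "After" string for every role.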

Format: GGUF
Model size: 7.62B params
Architecture: qwen2
Quantization: 4-bit


Model repository: pipilok/Qwen2.5-7B-Instruct-Q4_0_4_8-GGUF
Base model: Qwen/Qwen2.5-7B