Generate with CLI
mlx_lm.generate --model mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit --prompt "하늘이 파란 이유가 뭐야?"
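The sample prompt is Korean for "Why is the sky blue?". Generation length and sampling temperature can be set from the same command; flag names have shifted between mlx_lm releases, so treat this as a sketch and confirm with mlx_lm.generate --help:

mlx_lm.generate --model mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit --prompt "하늘이 파란 이유가 뭐야?" --max-tokens 256 --temp 0.2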
In Python
from mlx_lm import load, generate
model, tokenizer = load(
    "mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit",
    tokenizer_config={"trust_remote_code": True},
)
prompt = "하늘이 파란 이유가 뭐야?"  # "Why is the sky blue?"
messages = [
    # System prompt: "You are a friendly chatbot."
    {"role": "system", "content": "당신은 친절한 챗봇입니다."},
    {"role": "user", "content": prompt},
]
prompt = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
text = generate(
    model,
    tokenizer,
    prompt=prompt,
)
print(text)
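For interactive use, mlx_lm also exposes stream_generate, which yields output incrementally instead of returning one string. A minimal sketch reusing the model, tokenizer, and chat-templated prompt from above; note that recent mlx_lm releases yield response objects with a .text field, while older ones yielded plain strings:

from mlx_lm import stream_generate

for response in stream_generate(model, tokenizer, prompt=prompt, max_tokens=512):
    # Recent versions yield GenerationResponse objects; print each chunk as it arrives.
    print(response.text, end="", flush=True)
print()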
OpenAI Compatible HTTP Server
mlx_lm.server --model mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit --host 0.0.0.0
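The server listens on port 8080 by default and exposes the usual OpenAI-style routes. As a quick smoke test before wiring up a client, you can hit the chat completions endpoint directly with curl:

curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit",
    "messages": [{"role": "user", "content": "하늘이 파란 이유가 뭐야?"}],
    "temperature": 0.2
  }'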
import openai

client = openai.OpenAI(
    base_url="http://localhost:8080/v1",
    # The client refuses to start without an API key; the local server does not check it.
    api_key="not-needed",
)
prompt = "하늘이 파란 이유가 뭐야?"  # "Why is the sky blue?"
messages = [
    # System prompt: "You are a friendly chatbot."
    {"role": "system", "content": "당신은 친절한 챗봇입니다."},
    {"role": "user", "content": prompt},
]
res = client.chat.completions.create(
    model="mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit",
    messages=messages,
    temperature=0.2,
)
print(res.choices[0].message.content)
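Recent mlx_lm releases also handle the streamed variant of the chat completions API, so token-by-token output works through the standard stream=True flag of the OpenAI client. A minimal sketch reusing the client and messages above:

stream = client.chat.completions.create(
    model="mlx-community/Qwen2.5-7B-Instruct-kowiki-qa-4bit",
    messages=messages,
    temperature=0.2,
    stream=True,
)
for chunk in stream:
    # Each chunk carries an incremental delta; content may be None on the final chunk.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)
print()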