TinyQwex-4x620M-MoE

TinyQwex-4x620M-MoE is a Mixture of Experts (MoE) model built with LazyMergekit from the source models listed under Configuration below.

🌟 Buying me coffee is a direct way to show support for this project.

💻 Usage

!pip install -qU transformers bitsandbytes accelerate einops

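from transformers import AutoTokenizer
import transformers
import torch

model = "Isotonic/TinyQwex-4x620M-MoE"

# Load the tokenizer of the base model the experts were built from.
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen1.5-0.5B")

# Build a text-generation pipeline with 4-bit weights and bfloat16 compute.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    tokenizer=tokenizer,
    model_kwargs={"torch_dtype": torch.bfloat16, "load_in_4bit": True},
)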

messages = [{"role": "user", "content": "Explain what a Mixture of Experts is in less than 100 words."}]
prompt = pipeline.tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])
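
To make the `apply_chat_template` step above less opaque, here is a minimal sketch of what a ChatML-style template roughly produces. It assumes the tokenizer uses ChatML markers (`<|im_start|>`, `<|im_end|>`), as Qwen1.5 tokenizers generally do. The helper `render_chatml` is hypothetical and is not part of transformers, and the real template may differ, for example by injecting a default system message.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render chat messages in a ChatML-style layout (illustrative only).

    Assumption: each turn is wrapped as <|im_start|>role\\ncontent<|im_end|>.
    The tokenizer's actual chat template is the source of truth.
    """
    parts = []
    for message in messages:
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # Open an assistant turn so the model continues from here.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


messages = [
    {
        "role": "user",
        "content": "Explain what a Mixture of Experts is in less than 100 words.",
    }
]
prompt = render_chatml(messages)
print(prompt)
```

In practice, prefer `apply_chat_template`, since it reads the template shipped with the tokenizer instead of relying on a hand-written layout.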

🧩 Configuration

experts:
  - source_model: Qwen/Qwen1.5-0.5B
    positive_prompts:
    - "reasoning"

  - source_model: Qwen/Qwen1.5-0.5B
    positive_prompts:
    - "program"

  - source_model: Qwen/Qwen1.5-0.5B
    positive_prompts:
    - "storytelling"

  - source_model: Qwen/Qwen1.5-0.5B
    positive_prompts:
    - "Instruction following assistant"
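
In mergekit-style MoE merges, prompts like the ones above are generally used to initialize each expert's router weights. The sketch below is a toy illustration of top-k routing, not mergekit's or this model's implementation. The gate vectors, the expert functions, the 3-dimensional hidden state, and `top_k=2` are all made-up assumptions chosen to keep the arithmetic readable.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def route(hidden, gate_vectors, top_k=2):
    """Return (expert_index, weight) pairs for the top_k scoring experts."""
    # One logit per expert: dot product of the hidden state with its gate vector.
    logits = [sum(h * g for h, g in zip(hidden, gate)) for gate in gate_vectors]
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:top_k]
    # Normalize only over the selected experts.
    weights = softmax([logits[i] for i in top])
    return list(zip(top, weights))


def moe_output(hidden, gate_vectors, experts, top_k=2):
    """Weighted sum of the selected experts' outputs."""
    out = [0.0] * len(hidden)
    for index, weight in route(hidden, gate_vectors, top_k):
        expert_out = experts[index](hidden)
        out = [o + weight * v for o, v in zip(out, expert_out)]
    return out


# Hypothetical gate vectors, one per expert in the config above.
gate_vectors = [
    [1.0, 0.0, 0.0],  # "reasoning"
    [0.0, 1.0, 0.0],  # "program"
    [0.0, 0.0, 1.0],  # "storytelling"
    [0.5, 0.5, 0.5],  # "Instruction following assistant"
]
# Stand-in experts: each one just scales its input by a different factor.
experts = [lambda x, s=s: [s * v for v in x] for s in (1.0, 2.0, 3.0, 4.0)]

hidden = [0.2, 0.9, 0.1]  # toy hidden state for a single token
routed = route(hidden, gate_vectors)
print(routed)
print(moe_output(hidden, gate_vectors, experts))
```

Only the selected experts run for a given token, which is why an MoE can hold more parameters than it uses per forward pass.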

Model size: 1.24B params (Safetensors, tensor type BF16)