ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M
ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M is a custom merged language model based on Qwen2.5-7B with enhanced reasoning, roleplaying, and long-context capabilities. This model supports up to 1 million token context lengths, making it ideal for ultra-long text processing, deep reasoning tasks, and immersive roleplay interactions.
Model Details
- Base Model: Qwen/Qwen2.5-7B-Instruct-1M
- Models Used in Merge:
  - Qwen/Qwen2.5-7B-Instruct-1M
  - bunnycore/Qwen2.5-7B-RRP-1M
  - Triangle104/Q2.5-Instruct-1M_Harmony
  - Sakalti/SJT-7B-1M
  - huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
- Merge Method: Model Stock (`model_stock`), an optimized layer-wise weight averaging technique
Overview
Qwen2.5-7B-CelestialHarmony-1M enhances the Qwen2.5-7B series with a fine-tuned balance of roleplaying dynamics, structured reasoning, and long-context memory. The model is particularly well-suited for:
- Roleplaying: Immersive character-based storytelling with deep contextual awareness.
- Reasoning & Thought Processing: Capable of structured logical thinking, especially when prompted with `<think>` tags (a brief sketch follows this list).
- Ultra-Long Context Handling: Efficient processing of sequences up to 1,010,000 tokens using optimized sparse attention.
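As a minimal sketch of the `<think>` usage, assuming the model emits its reasoning inside `<think>...</think>` before the final answer (the prompt text and helper name below are illustrative, not part of the model card):

```python
import re

def strip_think(text: str) -> str:
    """Remove any <think>...</think> reasoning block, keeping only the final answer."""
    return re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()

# Illustrative prompt that invites structured reasoning before the answer.
prompt = (
    "Think step by step inside <think>...</think> tags, "
    "then give your final answer after the closing tag.\n\n"
    "Question: A caravan travels 42 km per day for 6 days. How far does it travel?"
)
```

After generating a response with the Quickstart code below, `strip_think(response)` would return only the portion after the reasoning block.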
Technical Specifications
| Specification | Value |
|---|---|
| Model Type | Causal Language Model |
| Parameters | 7.61B |
| Non-Embedding Parameters | 6.53B |
| Layers | 28 |
| Attention Heads (GQA) | 28 (Q), 4 (KV) |
| Max Context Length | 1,010,000 tokens |
| Max Generation Length | 8,192 tokens |
| Merge Method | Model Stock |
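To sanity-check these numbers locally, the shipped configuration can be inspected with `transformers` (a small sketch; attribute names follow the standard Qwen2 config):

```python
from transformers import AutoConfig

config = AutoConfig.from_pretrained("ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M")

# Compare the published specs against the shipped config.
print("Layers:         ", config.num_hidden_layers)       # expected 28
print("Query heads:    ", config.num_attention_heads)     # expected 28
print("KV heads (GQA): ", config.num_key_value_heads)     # expected 4
print("Max positions:  ", config.max_position_embeddings)
```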
Merging Details
This model was merged using the Model Stock method, which optimally averages weights from multiple fine-tuned models to create a more efficient, balanced, and performant model.
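As a rough illustration only (not the actual Model Stock algorithm, which additionally uses the geometry of the fine-tuned checkpoints to anchor the average toward the base model), plain layer-wise weight averaging looks like this:

```python
import torch

def layerwise_average(state_dicts):
    """Naive layer-wise weight averaging over several checkpoints.

    Model Stock refines this idea by weighting the average toward the
    base model based on the angles between the fine-tuned weights; this
    function only shows the basic averaging step.
    """
    merged = {}
    for name in state_dicts[0]:
        stacked = torch.stack([sd[name].float() for sd in state_dicts])
        merged[name] = stacked.mean(dim=0)
    return merged
```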
Merge YAML Configuration
base_model: Qwen/Qwen2.5-7B-Instruct-1M
dtype: bfloat16
merge_method: model_stock
models:
- model: Qwen/Qwen2.5-7B-Instruct-1M
- model: Triangle104/Q2.5-Instruct-1M_Harmony
- model: Sakalti/SJT-7B-1M
- model: bunnycore/Qwen2.5-7B-RRP-1M
- model: huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated
tokenizer_source: Qwen/Qwen2.5-7B-Instruct-1M
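This configuration follows mergekit's YAML format; assuming mergekit is installed, saving it to a file and running it with the `mergekit-yaml` command should reproduce an equivalent merge.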
Quickstart
Install Required Packages
Ensure you have the latest `transformers` library installed:
pip install transformers torch accelerate
Load and Use the Model
from transformers import AutoModelForCausalLM, AutoTokenizer
model_name = "ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M"
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype="auto",
device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)
prompt = "Tell me a short story about an ancient celestial warrior."
messages = [
{"role": "system", "content": "You are a wise celestial storyteller."},
{"role": "user", "content": prompt}
]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)
generated_ids = model.generate(**model_inputs, max_new_tokens=512)
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
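Note: with plain `transformers`, the usable context length is limited in practice by available GPU memory; for inputs approaching the 1M-token limit, prefer the vLLM deployment described below.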
Optimized Deployment with vLLM
For long-context inference, use the dual-chunk-attention branch of Qwen's vLLM fork:
git clone -b dev/dual-chunk-attn git@github.com:QwenLM/vllm.git
cd vllm
pip install -e . -v
Run the model:
vllm serve ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M \
--tensor-parallel-size 4 \
--max-model-len 1010000 \
--enable-chunked-prefill --max-num-batched-tokens 131072 \
--enforce-eager \
--max-num-seqs 1
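Once the server is running, it exposes an OpenAI-compatible API (by default on port 8000), so it can be queried with the `openai` client; a brief sketch:

```python
from openai import OpenAI

# vLLM's OpenAI-compatible server; the api_key value is ignored locally.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M",
    messages=[
        {"role": "system", "content": "You are a wise celestial storyteller."},
        {"role": "user", "content": "Tell me a short story about an ancient celestial warrior."},
    ],
    max_tokens=512,
)
print(response.choices[0].message.content)
```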
Model Capabilities
- Roleplay & Storytelling: Designed for engaging, immersive interactions.
- Long-Context Awareness: Handles texts up to 1M tokens.
- Logical Thinking & Reasoning: Supports the `<think>` tag to enhance thought structuring.
- Optimized Merge Strategy: Uses Model Stock for superior generalization.
Acknowledgments
This model is built on top of Qwen2.5-7B, with contributions from bunnycore, Triangle104, Sakalti, and huihui-ai, leveraging the Model Stock merging methodology.
For further details, see the Qwen/Qwen2.5-7B-Instruct-1M model card.