|
--- |
|
base_model: |
|
- Qwen/Qwen2.5-7B-Instruct-1M |
|
- Sakalti/SJT-7B-1M |
|
- Triangle104/Q2.5-Instruct-1M_Harmony |
|
- bunnycore/Qwen2.5-7B-RRP-1M |
|
- huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated |
|
library_name: transformers |
|
tags: |
|
- mergekit |
|
- merge |
|
license: mit |
|
--- |
|
# ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M |
|
|
|
**ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M** is a custom merged language model based on **Qwen2.5-7B** with enhanced reasoning, roleplaying, and long-context capabilities. This model supports up to **1 million token** context lengths, making it ideal for ultra-long text processing, deep reasoning tasks, and immersive roleplay interactions. |
|
|
|
Quants are available in GGUF format, provided by [mradermacher](https://huggingface.co/mradermacher):
|
1. [GGUF](https://huggingface.co/mradermacher/Qwen2.5-7B-CelestialHarmony-1M-GGUF) |
|
2. [imatrix GGUF](https://huggingface.co/mradermacher/Qwen2.5-7B-CelestialHarmony-1M-i1-GGUF) |
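
To run one of these quants locally, `llama-cpp-python` can pull the file straight from the Hugging Face repo. The snippet below is a minimal sketch; the Q4_K_M filename pattern is an assumption, so check the GGUF repo's file list for the quant you actually want.

```python
# Minimal sketch: load a GGUF quant with llama-cpp-python (pip install llama-cpp-python).
# The quant filename pattern is an assumption; verify it against the GGUF repo's files.
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="mradermacher/Qwen2.5-7B-CelestialHarmony-1M-GGUF",
    filename="*Q4_K_M.gguf",  # wildcard match on the chosen quant
    n_ctx=8192,               # raise for longer contexts if you have the memory
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Introduce yourself in one sentence."}]
)
print(out["choices"][0]["message"]["content"])
```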
|
--- |
|
|
|
## **Model Details**
|
- **Base Model**: `Qwen/Qwen2.5-7B-Instruct-1M` |
|
- **Models Used in Merge**: |
|
- `Qwen/Qwen2.5-7B-Instruct-1M` |
|
- `bunnycore/Qwen2.5-7B-RRP-1M` |
|
- `Triangle104/Q2.5-Instruct-1M_Harmony` |
|
- `Sakalti/SJT-7B-1M` |
|
- `huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated` |
|
- **Merge Method**: `model_stock` (optimized layer-wise weight averaging)
|
|
|
--- |
|
|
|
## **Overview**
|
**Qwen2.5-7B-CelestialHarmony-1M** enhances the **Qwen2.5-7B series** with a fine-tuned balance of roleplaying dynamics, structured reasoning, and long-context memory. The model is particularly well-suited for: |
|
- **Roleplaying**: Immersive character-based storytelling with deep contextual awareness.

- **Reasoning & Thought Processing**: Structured logical thinking, especially when prompted with `<think>` tags (see the prompt sketch after this list).

- **Ultra-Long Context Handling**: Efficient processing of sequences up to **1,010,000 tokens** using optimized sparse attention.
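
The `<think>` behavior is prompt-driven rather than a dedicated API. The sketch below shows one way to ask for reasoning inside `<think>` tags; the exact system wording is an assumption, not a template shipped with the model.

```python
# Hypothetical prompt pattern for eliciting <think>-style reasoning.
# The system prompt wording here is an illustrative assumption; adjust to taste.
messages = [
    {
        "role": "system",
        "content": (
            "You are a careful reasoner. Think step by step inside <think>...</think> "
            "tags, then give your final answer after the closing tag."
        ),
    },
    {
        "role": "user",
        "content": "A starship doubles its speed every hour, starting at 2 km/s. How fast is it after 5 hours?",
    },
]
# Pass `messages` through the chat template exactly as in the Quickstart section below.
```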
|
|
|
--- |
|
|
|
## **Technical Specifications**
|
| Specification | Value | |
|
|--------------|---------| |
|
| **Model Type** | Causal Language Model | |
|
| **Parameters** | 7.61B | |
|
| **Non-Embedding Parameters** | 6.53B | |
|
| **Layers** | 28 | |
|
| **Attention Heads (GQA)** | 28 (Q), 4 (KV) | |
|
| **Max Context Length** | 1,010,000 tokens | |
|
| **Max Generation Length** | 8,192 tokens | |
|
| **Merge Method** | Model Stock |
|
|
|
--- |
|
|
|
## **Merging Details**
|
This model was merged using the **Model Stock** method, which averages the weights of multiple fine-tuned models around a common base to produce a balanced, well-generalizing merge.
|
|
|
### **Merge YAML Configuration** |
|
```yaml |
|
base_model: Qwen/Qwen2.5-7B-Instruct-1M |
|
dtype: bfloat16 |
|
merge_method: model_stock |
|
models: |
|
- model: Qwen/Qwen2.5-7B-Instruct-1M |
|
- model: Triangle104/Q2.5-Instruct-1M_Harmony |
|
- model: Sakalti/SJT-7B-1M |
|
- model: bunnycore/Qwen2.5-7B-RRP-1M |
|
- model: huihui-ai/Qwen2.5-7B-Instruct-1M-abliterated |
|
tokenizer_source: Qwen/Qwen2.5-7B-Instruct-1M |
|
``` |
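
For intuition, Model Stock can be pictured as pulling the layer-wise average of the fine-tuned checkpoints part of the way back toward the base model, with the interpolation ratio derived from the geometry (roughly, the angles) between the fine-tuned weights. The toy sketch below illustrates that idea with a hard-coded ratio; it is not mergekit's implementation, and `t = 0.6` is a made-up placeholder.

```python
# Toy illustration of the Model Stock idea (NOT mergekit's actual code).
# Real Model Stock derives the ratio per layer from angles between the
# fine-tuned weights; here it is hard-coded purely for illustration.
import torch

def model_stock_layer(base: torch.Tensor, finetunes: list[torch.Tensor], t: float = 0.6) -> torch.Tensor:
    """Blend the average of the fine-tuned weights with the base weights."""
    avg = torch.stack(finetunes).mean(dim=0)  # average the fine-tuned checkpoints
    return t * avg + (1.0 - t) * base         # interpolate back toward the base

# Stand-in tensors for a single layer's weight matrix
base = torch.randn(4, 4)
finetunes = [base + 0.1 * torch.randn(4, 4) for _ in range(4)]
print(model_stock_layer(base, finetunes).shape)  # torch.Size([4, 4])
```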
|
|
|
--- |
|
|
|
## **Quickstart**
|
### **Install Required Packages** |
|
Ensure you have the latest `transformers` library installed: |
|
```bash |
|
pip install transformers torch accelerate |
|
``` |
|
|
|
### **Load and Use the Model** |
|
```python |
|
from transformers import AutoModelForCausalLM, AutoTokenizer |
|
|
|
model_name = "ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M" |
|
|
|
model = AutoModelForCausalLM.from_pretrained( |
|
model_name, |
|
torch_dtype="auto", |
|
device_map="auto" |
|
) |
|
tokenizer = AutoTokenizer.from_pretrained(model_name) |
|
|
|
prompt = "Tell me a short story about an ancient celestial warrior." |
|
messages = [ |
|
{"role": "system", "content": "You are a wise celestial storyteller."}, |
|
{"role": "user", "content": prompt} |
|
] |
|
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True) |
|
model_inputs = tokenizer([text], return_tensors="pt").to(model.device) |
|
|
|
generated_ids = model.generate(**model_inputs, max_new_tokens=512) |
|
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0] |
|
|
|
print(response) |
|
``` |
|
|
|
--- |
|
|
|
## **Optimized Deployment with vLLM**
|
For full long-context inference, build Qwen's **vLLM** fork with dual chunk attention support:
|
```bash |
|
git clone -b dev/dual-chunk-attn git@github.com:QwenLM/vllm.git
|
cd vllm |
|
pip install -e . -v |
|
``` |
|
Run the model: |
|
```bash |
|
vllm serve ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M \ |
|
--tensor-parallel-size 4 \ |
|
--max-model-len 1010000 \ |
|
--enable-chunked-prefill --max-num-batched-tokens 131072 \ |
|
--enforce-eager \ |
|
--max-num-seqs 1 |
|
``` |
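
Once the server is running (vLLM exposes an OpenAI-compatible API, by default on port 8000), you can query it with the standard `openai` client. The snippet below assumes that default port and a dummy API key.

```python
# Query the vLLM server through its OpenAI-compatible endpoint.
# Assumes the default port 8000 and no authentication.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

response = client.chat.completions.create(
    model="ZeroXClem/Qwen2.5-7B-CelestialHarmony-1M",
    messages=[
        {"role": "system", "content": "You are a wise celestial storyteller."},
        {"role": "user", "content": "Summarize the saga of a wandering comet in three sentences."},
    ],
    max_tokens=256,
)
print(response.choices[0].message.content)
```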
|
|
|
--- |
|
|
|
## **Model Capabilities**
|
- **Roleplay & Storytelling**: Designed for engaging, immersive interactions.

- **Long-Context Awareness**: Handles texts up to **1M tokens**.

- **Logical Thinking & Reasoning**: Supports `<think>` tags for structured reasoning.

- **Optimized Merge Strategy**: Uses `model_stock` merging for stronger generalization.
|
|
|
--- |
|
|
|
## **Acknowledgments**
|
This model is built on top of **Qwen2.5-7B-Instruct-1M**, merging contributions from **bunnycore, Triangle104, Sakalti, and huihui-ai** via the **Model Stock** merging methodology.
|
|
|
For further details, see: |
|
- [Qwen2.5-1M Technical Report](https://arxiv.org/abs/2501.15383)

- [MergeKit Documentation](https://github.com/arcee-ai/mergekit)

- [vLLM for Long-Context Inference](https://github.com/QwenLM/vllm)
|
|
|
--- |