language:
- en
license: apache-2.0
library_name: transformers
tags:
- merge
- mergekit
- lazymergekit
- creative
- roleplay
- instruct
- qwen
- model_stock
- bfloat16
base_model:
- newsbang/Homer-v0.5-Qwen2.5-7B
- allknowingroger/HomerSlerp1-7B
- bunnycore/Qwen2.5-7B-Instruct-Fusion
- bunnycore/Qandora-2.5-7B-Creative
model-index:
- name: Qwen2.5-7B-HomerCreative-Mix
results:
- task:
type: text-generation
name: Text Generation
dataset:
name: IFEval (0-Shot)
type: HuggingFaceH4/ifeval
args:
num_few_shot: 0
metrics:
- type: inst_level_strict_acc and prompt_level_strict_acc
value: 78.35
name: strict accuracy
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerCreative-Mix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: BBH (3-Shot)
type: BBH
args:
num_few_shot: 3
metrics:
- type: acc_norm
value: 36.77
name: normalized accuracy
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerCreative-Mix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MATH Lvl 5 (4-Shot)
type: hendrycks/competition_math
args:
num_few_shot: 4
metrics:
- type: exact_match
value: 32.33
name: exact match
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerCreative-Mix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: GPQA (0-shot)
type: Idavidrein/gpqa
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 6.6
name: acc_norm
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerCreative-Mix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MuSR (0-shot)
type: TAUR-Lab/MuSR
args:
num_few_shot: 0
metrics:
- type: acc_norm
value: 13.77
name: acc_norm
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerCreative-Mix
name: Open LLM Leaderboard
- task:
type: text-generation
name: Text Generation
dataset:
name: MMLU-PRO (5-shot)
type: TIGER-Lab/MMLU-Pro
config: main
split: test
args:
num_few_shot: 5
metrics:
- type: acc
value: 38.3
name: accuracy
source:
url: >-
https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=ZeroXClem/Qwen2.5-7B-HomerCreative-Mix
name: Open LLM Leaderboard
ZeroXClem/Qwen2.5-7B-HomerCreative-Mix
ZeroXClem/Qwen2.5-7B-HomerCreative-Mix is an advanced language model meticulously crafted by merging four pre-trained models using the powerful mergekit framework. This fusion leverages the Model Stock merge method to combine the creative prowess of Qandora, the instructive capabilities of Qwen-Instruct-Fusion, the sophisticated blending of HomerSlerp1, and the foundational conversational strengths of Homer-v0.5-Qwen2.5-7B. The resulting model excels in creative text generation, contextual understanding, and dynamic conversational interactions.
π Merged Models
This model merge incorporates the following:
bunnycore/Qandora-2.5-7B-Creative: Specializes in creative text generation, enhancing the model's ability to produce imaginative and diverse content.
bunnycore/Qwen2.5-7B-Instruct-Fusion: Focuses on instruction-following capabilities, improving the model's performance in understanding and executing user commands.
allknowingroger/HomerSlerp1-7B: Utilizes spherical linear interpolation (SLERP) to blend model weights smoothly, ensuring a harmonious integration of different model attributes.
newsbang/Homer-v0.5-Qwen2.5-7B: Acts as the foundational conversational model, providing robust language comprehension and generation capabilities.
𧩠Merge Configuration
The configuration below outlines how the models are merged using the Model Stock method. This approach ensures a balanced and effective integration of the unique strengths from each source model.
# Merge configuration for ZeroXClem/Qwen2.5-7B-HomerCreative-Mix using Model Stock
models:
- model: bunnycore/Qandora-2.5-7B-Creative
- model: bunnycore/Qwen2.5-7B-Instruct-Fusion
- model: allknowingroger/HomerSlerp1-7B
merge_method: model_stock
base_model: newsbang/Homer-v0.5-Qwen2.5-7B
normalize: false
int8_mask: true
dtype: bfloat16
Key Parameters
Merge Method (
merge_method
): Utilizes the Model Stock method, as described in Model Stock, to effectively combine multiple models by leveraging their strengths.Models (
models
): Specifies the list of models to be merged:- bunnycore/Qandora-2.5-7B-Creative: Enhances creative text generation.
- bunnycore/Qwen2.5-7B-Instruct-Fusion: Improves instruction-following capabilities.
- allknowingroger/HomerSlerp1-7B: Facilitates smooth blending of model weights using SLERP.
Base Model (
base_model
): Defines the foundational model for the merge, which is newsbang/Homer-v0.5-Qwen2.5-7B in this case.Normalization (
normalize
): Set tofalse
to retain the original scaling of the model weights during the merge.INT8 Mask (
int8_mask
): Enabled (true
) to apply INT8 quantization masking, optimizing the model for efficient inference without significant loss in precision.Data Type (
dtype
): Usesbfloat16
to maintain computational efficiency while ensuring high precision.
π Performance Highlights
Creative Text Generation: Enhanced ability to produce imaginative and diverse content suitable for creative writing, storytelling, and content creation.
Instruction Following: Improved performance in understanding and executing user instructions, making the model more responsive and accurate in task execution.
Optimized Inference: INT8 masking and
bfloat16
data type contribute to efficient computation, enabling faster response times without compromising quality.
π― Use Case & Applications
ZeroXClem/Qwen2.5-7B-HomerCreative-Mix is designed to excel in environments that demand both creative generation and precise instruction following. Ideal applications include:
Creative Writing Assistance: Aiding authors and content creators in generating imaginative narratives, dialogues, and descriptive text.
Interactive Storytelling and Role-Playing: Enhancing dynamic and engaging interactions in role-playing games and interactive storytelling platforms.
Educational Tools and Tutoring Systems: Providing detailed explanations, answering questions, and assisting in educational content creation with contextual understanding.
Technical Support and Customer Service: Offering accurate and contextually relevant responses in technical support scenarios, improving user satisfaction.
Content Generation for Marketing: Creating compelling and diverse marketing copy, social media posts, and promotional material with creative flair.
π Usage
To utilize ZeroXClem/Qwen2.5-7B-HomerCreative-Mix, follow the steps below:
Installation
First, install the necessary libraries:
pip install -qU transformers accelerate
Example Code
Below is an example of how to load and use the model for text generation:
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline
import torch
# Define the model name
model_name = "ZeroXClem/Qwen2.5-7B-HomerCreative-Mix"
# Load the tokenizer
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Load the model
model = AutoModelForCausalLM.from_pretrained(
model_name,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Initialize the pipeline
text_generator = pipeline(
"text-generation",
model=model,
tokenizer=tokenizer,
torch_dtype=torch.bfloat16,
device_map="auto"
)
# Define the input prompt
prompt = "Once upon a time in a land far, far away,"
# Generate the output
outputs = text_generator(
prompt,
max_new_tokens=150,
do_sample=True,
temperature=0.7,
top_k=50,
top_p=0.95
)
# Print the generated text
print(outputs[0]["generated_text"])
Notes
Fine-Tuning: This merged model may require fine-tuning to optimize performance for specific applications or domains.
Resource Requirements: Ensure that your environment has sufficient computational resources, especially GPU-enabled hardware, to handle the model efficiently during inference.
Customization: Users can adjust parameters such as
temperature
,top_k
, andtop_p
to control the creativity and diversity of the generated text.
π License
This model is open-sourced under the Apache-2.0 License.
π‘ Tags
merge
mergekit
model_stock
Qwen
Homer
Creative
ZeroXClem/Qwen2.5-7B-HomerCreative-Mix
bunnycore/Qandora-2.5-7B-Creative
bunnycore/Qwen2.5-7B-Instruct-Fusion
allknowingroger/HomerSlerp1-7B
newsbang/Homer-v0.5-Qwen2.5-7B
Open LLM Leaderboard Evaluation Results
Detailed results can be found here
Metric | Value |
---|---|
Avg. | 34.35 |
IFEval (0-Shot) | 78.35 |
BBH (3-Shot) | 36.77 |
MATH Lvl 5 (4-Shot) | 32.33 |
GPQA (0-shot) | 6.60 |
MuSR (0-shot) | 13.77 |
MMLU-PRO (5-shot) | 38.30 |