---
base_model:
- CultriX/SeQwence-14B
- VAGOsolutions/SauerkrautLM-v2-14b-DPO
- v000000/Qwen2.5-Lumen-14B
- CultriX/Qwen2.5-14B-Wernicke
- Qwen/Qwen2.5-14B
- CultriX/Qwen2.5-14B-MegaMerge-pt2
library_name: transformers
tags:
- mergekit
- merge
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-generation
---
# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged with the [DARE](https://arxiv.org/abs/2311.03099) [TIES](https://arxiv.org/abs/2306.01708) merge method, using [Qwen/Qwen2.5-14B](https://huggingface.co/Qwen/Qwen2.5-14B) as the base model.

### Models Merged

The following models were included in the merge:

* [CultriX/SeQwence-14B](https://huggingface.co/CultriX/SeQwence-14B)
* [VAGOsolutions/SauerkrautLM-v2-14b-DPO](https://huggingface.co/VAGOsolutions/SauerkrautLM-v2-14b-DPO)
* [v000000/Qwen2.5-Lumen-14B](https://huggingface.co/v000000/Qwen2.5-Lumen-14B)
* [CultriX/Qwen2.5-14B-Wernicke](https://huggingface.co/CultriX/Qwen2.5-14B-Wernicke)
* [CultriX/Qwen2.5-14B-MegaMerge-pt2](https://huggingface.co/CultriX/Qwen2.5-14B-MegaMerge-pt2)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: CultriX/Qwen2.5-14B-Wernicke
    parameters:
      weight: 0.35    # Strong performance on GPQA, MUSR, and MMLU-PRO
      density: 0.6    # Retain 60% of significant parameters
  - model: VAGOsolutions/SauerkrautLM-v2-14b-DPO
    parameters:
      weight: 0.30    # Exceptional IFEval and MATH Level 5 capabilities
      density: 0.6    # Retain 60% of significant parameters
  - model: CultriX/Qwen2.5-14B-MegaMerge-pt2
    parameters:
      weight: 0.20    # Balanced contributions to TruthfulQA and MMLU
      density: 0.5    # Retain 50% of significant parameters
  - model: CultriX/SeQwence-14B
    parameters:
      weight: 0.15    # Provides diverse data and generalization
      density: 0.4    # Retain 40% of significant parameters
  - model: v000000/Qwen2.5-Lumen-14B
    parameters:
      weight: 0.10    # Enhances creative and narrative tasks
      density: 0.5    # Retain 50% for task diversity
base_model: Qwen/Qwen2.5-14B
merge_method: dare_ties
parameters:
  normalize: true     # Ensures parameter scaling compatibility
  int8_mask: true     # Optimizes memory and computational efficiency
dtype: bfloat16
tokenizer_source: Qwen/Qwen2.5-14B-Instruct
```
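
## Usage

The configuration above is consumed by mergekit's `mergekit-yaml` entry point to produce the merged weights; once built, the result loads like any other `transformers` causal language model. Below is a minimal usage sketch; the model path is a placeholder, not this repository's actual id, and the prompt is only illustrative.

```python
# Minimal loading/generation sketch for the merged model.
# NOTE: "path/to/merged-model" is a placeholder -- replace it with the actual
# Hugging Face repository id or the local output directory of the merge.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "path/to/merged-model"  # placeholder path (assumption)

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # matches the dtype in the merge config
    device_map="auto",
)

prompt = "Summarize the DARE-TIES merge method in one paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```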