---
base_model:
- autoprogrammer/Llama-3.2-1B-Instruct-MGSM8K-sft1
- autoprogrammer/Llama-3.2-1B-Instruct-medmcqa-zh-linear
- Alelcv27/llama3.2-1b-math-code
- huyhoangt2201/llama-3.2-1b-sql_finetuned_billingual_3.0_merged
- meta-llama/Llama-3.2-1B
- meta-llama/Llama-3.2-1B-Instruct
library_name: transformers
tags:
- mergekit
- merge
---

# merge

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the linear [DARE](https://arxiv.org/abs/2311.03099) merge method, with [meta-llama/Llama-3.2-1B](https://huggingface.co/meta-llama/Llama-3.2-1B) as the base model.

### Models Merged

The following models were included in the merge:

* [autoprogrammer/Llama-3.2-1B-Instruct-MGSM8K-sft1](https://huggingface.co/autoprogrammer/Llama-3.2-1B-Instruct-MGSM8K-sft1)
* [autoprogrammer/Llama-3.2-1B-Instruct-medmcqa-zh-linear](https://huggingface.co/autoprogrammer/Llama-3.2-1B-Instruct-medmcqa-zh-linear)
* [Alelcv27/llama3.2-1b-math-code](https://huggingface.co/Alelcv27/llama3.2-1b-math-code)
* [huyhoangt2201/llama-3.2-1b-sql_finetuned_billingual_3.0_merged](https://huggingface.co/huyhoangt2201/llama-3.2-1b-sql_finetuned_billingual_3.0_merged)
* [meta-llama/Llama-3.2-1B-Instruct](https://huggingface.co/meta-llama/Llama-3.2-1B-Instruct)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: dare_linear
architectures: ["transformer"]
base_model: meta-llama/Llama-3.2-1B
models:
  - model: Alelcv27/llama3.2-1b-math-code
  - model: huyhoangt2201/llama-3.2-1b-sql_finetuned_billingual_3.0_merged
  - model: autoprogrammer/Llama-3.2-1B-Instruct-MGSM8K-sft1
  - model: meta-llama/Llama-3.2-1B-Instruct
  - model: autoprogrammer/Llama-3.2-1B-Instruct-medmcqa-zh-linear
parameters:
  density: 0.5
  weight: 1.0
  int8_mask: true
```
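
### Usage

A configuration like the one above is typically run with mergekit's `mergekit-yaml` CLI (roughly `mergekit-yaml config.yaml ./output-dir`), and the resulting checkpoint can then be loaded like any other Llama-3.2-1B model. Below is a minimal, hedged sketch of loading the merge with 🤗 Transformers; `"path/to/merged-model"` is a placeholder for the local mergekit output directory or the Hugging Face repo id where this merge is hosted, not a name taken from this card.

```python
# Minimal usage sketch. Assumption: the merged weights live at
# "path/to/merged-model" (placeholder), either a local mergekit output
# directory or a Hugging Face repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "path/to/merged-model"  # placeholder, replace with the real path/repo

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto")

prompt = "Write a SQL query that counts the rows in a table named orders."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```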