---
language: en
tags:
- llama
- text-generation
- model-merging
license: mit
base_model:
- meta-llama/Meta-Llama-3-8B
library_name: transformers
---

# llama-3-8b-merged-linear

## Overview

This model is a linear merge of three Llama-3-8B fine-tunes, built with the Mergekit tool. The merge combines the complementary strengths of the base models, such as multilingual capability and specialized domain knowledge, into a single, more versatile model that performs well across text generation, multilingual understanding, and domain-specific tasks.

## Model Details

### Model Description

- **Models Used**:
  - Danielbrdz/Barcenas-Llama3-8b-ORPO
  - DeepMount00/Llama-3-8b-Ita
  - lightblue/suzume-llama-3-8B-multilingual
- **Merging Tool**: Mergekit
- **Merge Method**: Linear merge with equal weighting (1.0) for all models
- **Tokenizer Source**: Union
- **Data Type**: float16 (FP16) precision
- **License**: MIT License
- **Languages Supported**: Multilingual, including English, Italian, and potentially other languages contributed by the multilingual base model

## Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Danielbrdz/Barcenas-Llama3-8b-ORPO
    parameters:
      weight: 1.0
  - model: DeepMount00/Llama-3-8b-Ita
    parameters:
      weight: 1.0
  - model: lightblue/suzume-llama-3-8B-multilingual
    parameters:
      weight: 1.0
merge_method: linear
tokenizer_source: union
dtype: float16
```
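With equal weights of 1.0, the linear method reduces to a plain parameter-wise average, so each model contributes one third of every merged tensor. A minimal sketch of the underlying operation (illustrative only; Mergekit's actual implementation additionally handles checkpoint sharding, dtype casting, and tokenizer alignment):

```python
import torch

def linear_merge(state_dicts, weights):
    """Parameter-wise weighted average of model state dicts.

    Illustrates what merge_method: linear computes. Accumulation is
    done in float32 for numerical stability, then cast back to the
    merge dtype (float16), matching the config above.
    """
    total = sum(weights)
    merged = {}
    for name in state_dicts[0]:
        merged[name] = sum(
            w * sd[name].to(torch.float32)
            for w, sd in zip(weights, state_dicts)
        ) / total
    return {name: t.to(torch.float16) for name, t in merged.items()}
```

With weights of (1.0, 1.0, 1.0), `total` is 3.0 and every merged parameter is the simple mean of the three source parameters.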
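To reproduce the merge, Mergekit can also be driven from Python rather than its CLI. A minimal sketch, assuming the `run_merge` entry point exposed by recent Mergekit releases and the configuration above saved as `config.yaml` (a hypothetical filename); check your installed version, as the API may differ:

```python
import yaml
import torch
from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

# Parse the YAML merge configuration shown above.
with open("config.yaml") as f:
    merge_config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    merge_config,
    "./llama-3-8b-merged-linear",        # output directory for the merged model
    options=MergeOptions(
        cuda=torch.cuda.is_available(),  # merge on GPU if one is available
        copy_tokenizer=True,             # write out the union tokenizer
    ),
)
```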
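## Usage

Since the card declares `library_name: transformers`, the merged model loads through the standard `transformers` API. A minimal sketch; the repo id below is a placeholder for wherever this model is hosted on the Hub:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-username/llama-3-8b-merged-linear"  # placeholder repo id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # matches the merge dtype
    device_map="auto",
)

prompt = "Translate to Italian: The weather is nice today."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```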