---
base_model:
- NousResearch/Meta-Llama-3-8B-Instruct
- NousResearch/Meta-Llama-3-8B
library_name: transformers
tags:
- mergekit
- merge
---
# Meta-Llama-3-8B-InitializedEmbeds

This is just [Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) with the embeddings for the special tokens copied from the Instruct version. It should behave pretty much identically to the base model, but with less glossolalia when it encounters `<|start_header_id|>` and the like.

I'm using this as a base to fine-tune. Having these embeddings start from reasonable values instead of random initialization should give a smoother start.

This is a merge of pre-trained language models created using [mergekit](https://github.com/cg123/mergekit).

## Merge Details

### Merge Method

This model was merged using the [linear](https://arxiv.org/abs/2203.05482) merge method.

### Models Merged

The following models were included in the merge:

* [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
* [NousResearch/Meta-Llama-3-8B](https://huggingface.co/NousResearch/Meta-Llama-3-8B)

### Configuration

The following YAML configuration was used to produce this model:

```yaml
merge_method: linear
dtype: float32
out_dtype: bfloat16
models:
  - model: NousResearch/Meta-Llama-3-8B
    parameters:
      weight: 1.0
  - model: NousResearch/Meta-Llama-3-8B-Instruct
    parameters:
      weight: 0.0
tokenizer:
  source: NousResearch/Meta-Llama-3-8B-Instruct
  tokens:
    <|start_header_id|>:
      source: NousResearch/Meta-Llama-3-8B-Instruct
      force: true
    <|end_header_id|>:
      source: NousResearch/Meta-Llama-3-8B-Instruct
      force: true
    <|eot_id|>:
      source: NousResearch/Meta-Llama-3-8B-Instruct
      force: true
    <|end_of_text|>:
      source: NousResearch/Meta-Llama-3-8B-Instruct
      force: true
    <|begin_of_text|>:
      source: NousResearch/Meta-Llama-3-8B-Instruct
      force: true
```
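Note the weights: 1.0 for the base model and 0.0 for Instruct means the linear merge leaves the transformer weights identical to the base model, and the `tokenizer.tokens` overrides do all the actual work of pulling the special-token embeddings from Instruct. To reproduce the merge, save the configuration as e.g. `config.yaml` and run mergekit's CLI: `mergekit-yaml config.yaml ./Meta-Llama-3-8B-InitializedEmbeds` (the output directory name is illustrative).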
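If you'd rather see the operation spelled out than infer it from the merge config, here is a minimal `transformers` sketch of the equivalent manual edit. The model IDs and token list come from the config above; the output path, and the choice to also copy the LM-head rows, are illustrative assumptions on my part, not a description of how this repo was built (it was built with mergekit):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

base_id = "NousResearch/Meta-Llama-3-8B"
instruct_id = "NousResearch/Meta-Llama-3-8B-Instruct"
special_tokens = [
    "<|start_header_id|>", "<|end_header_id|>", "<|eot_id|>",
    "<|end_of_text|>", "<|begin_of_text|>",
]

tokenizer = AutoTokenizer.from_pretrained(instruct_id)
base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
instruct = AutoModelForCausalLM.from_pretrained(instruct_id, torch_dtype=torch.bfloat16)

with torch.no_grad():
    for token in special_tokens:
        idx = tokenizer.convert_tokens_to_ids(token)
        # Input-embedding row: what the model "sees" when the token appears.
        base.get_input_embeddings().weight[idx] = (
            instruct.get_input_embeddings().weight[idx]
        )
        # Llama 3 does not tie input and output embeddings, so also copy the
        # LM-head row (assumption: you want the model able to emit these
        # tokens sensibly, not just read them).
        base.get_output_embeddings().weight[idx] = (
            instruct.get_output_embeddings().weight[idx]
        )
        # Sanity check: the rows now match.
        assert torch.equal(
            base.get_input_embeddings().weight[idx],
            instruct.get_input_embeddings().weight[idx],
        )

out_dir = "Meta-Llama-3-8B-InitializedEmbeds"  # illustrative output path
base.save_pretrained(out_dir)
tokenizer.save_pretrained(out_dir)
```

The same loop doubles as a verification script: point `base_id` at this repo instead, drop the assignments, and the `torch.equal` checks confirm that the special-token rows match the Instruct model.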