mlabonne/ChimeraLlama-3-8B

@@ -4,6 +4,7 @@ tags:
 - merge
 - mergekit
 - lazymergekit
 base_model:
 - NousResearch/Meta-Llama-3-8B-Instruct
 - mlabonne/OrpoLlama-3-8B
@@ -11,11 +12,11 @@ base_model:
 - abacusai/Llama-3-Smaug-8B
 ---
-# Chimera-8B
-Chimera-8B outperforms Llama 3 8B Instruct on Nous' benchmark suite.
-Chimera-8B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
 * [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
 * [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B)
 * [Locutusque/Llama-3-Orca-1.0-8B](https://huggingface.co/Locutusque/Llama-3-Orca-1.0-8B)
@@ -29,7 +30,7 @@ Evaluation performed using [LLM AutoEval](https://github.com/mlabonne/llm-autoev
 | Model                                                                                                                                                                     |   Average |   AGIEval |   GPT4All | TruthfulQA |  Bigbench |
 | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------: | --------: | --------: | ---------: | --------: |
-| [**mlabonne/Chimera-8B**](https://huggingface.co/mlabonne/Chimera-8B) [📄](https://gist.github.com/mlabonne/28d31153628dccf781b74f8071c7c7e4) | **51.58** | **39.12** | **71.81** | **52.4** | **42.98** |
 | [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) [📄](https://gist.github.com/mlabonne/8329284d86035e6019edb11eb0933628) |     51.34 |     41.22 |     69.86 |      51.65 |     42.64 |
 | [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B) [📄](https://gist.github.com/mlabonne/22896a1ae164859931cc8f4858c97f6f)                     | 48.63 | 34.17 | 70.59 | 52.39 | 37.36 |
 | [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) [📄](https://gist.github.com/mlabonne/616b6245137a9cfc4ea80e4c6e55d847)                   |     45.42 |      31.1 |     69.95 |      43.91 |      36.7 |
@@ -73,7 +74,7 @@ from transformers import AutoTokenizer
 import transformers
 import torch
-model = "mlabonne/Chimera-8B"
 messages = [{"role": "user", "content": "What is a large language model?"}]
 tokenizer = AutoTokenizer.from_pretrained(model)

 - merge
 - mergekit
 - lazymergekit
+- llama
 base_model:
 - NousResearch/Meta-Llama-3-8B-Instruct
 - mlabonne/OrpoLlama-3-8B
 - abacusai/Llama-3-Smaug-8B
 ---
+# ChimeraLlama-3-8B
+ChimeraLlama-3-8B outperforms Llama 3 8B Instruct on average on Nous' benchmark suite (51.58 vs. 51.34), leading on GPT4All, TruthfulQA, and Bigbench but trailing on AGIEval.
+ChimeraLlama-3-8B is a merge of the following models using [LazyMergekit](https://colab.research.google.com/drive/1obulZ1ROXHjYLn6PPZJwRR6GzgQogxxb?usp=sharing):
 * [NousResearch/Meta-Llama-3-8B-Instruct](https://huggingface.co/NousResearch/Meta-Llama-3-8B-Instruct)
 * [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B)
 * [Locutusque/Llama-3-Orca-1.0-8B](https://huggingface.co/Locutusque/Llama-3-Orca-1.0-8B)
 | Model                                                                                                                                                                     |   Average |   AGIEval |   GPT4All | TruthfulQA |  Bigbench |
 | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------: | --------: | --------: | ---------: | --------: |
+| [**mlabonne/ChimeraLlama-3-8B**](https://huggingface.co/mlabonne/ChimeraLlama-3-8B) [📄](https://gist.github.com/mlabonne/28d31153628dccf781b74f8071c7c7e4) | **51.58** | **39.12** | **71.81** | **52.4** | **42.98** |
 | [meta-llama/Meta-Llama-3-8B-Instruct](https://huggingface.co/meta-llama/Meta-Llama-3-8B-Instruct) [📄](https://gist.github.com/mlabonne/8329284d86035e6019edb11eb0933628) |     51.34 |     41.22 |     69.86 |      51.65 |     42.64 |
 | [mlabonne/OrpoLlama-3-8B](https://huggingface.co/mlabonne/OrpoLlama-3-8B) [📄](https://gist.github.com/mlabonne/22896a1ae164859931cc8f4858c97f6f)                     | 48.63 | 34.17 | 70.59 | 52.39 | 37.36 |
 | [meta-llama/Meta-Llama-3-8B](https://huggingface.co/meta-llama/Meta-Llama-3-8B) [📄](https://gist.github.com/mlabonne/616b6245137a9cfc4ea80e4c6e55d847)                   |     45.42 |      31.1 |     69.95 |      43.91 |      36.7 |
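
As a quick sanity check on the table, the sketch below recomputes each Average and lists the benchmarks on which the merged model beats the Instruct baseline. It assumes the Average column is the unweighted mean of the four benchmark columns; the card does not state this. The dictionary layout and the helper names `mean_score` and `wins` are illustrative and not part of LLM AutoEval. The scores are copied from the table.

```python
# Benchmark columns, in the order they appear in the table.
BENCHMARKS = ("AGIEval", "GPT4All", "TruthfulQA", "Bigbench")

# Per-benchmark scores copied from the table.
SCORES = {
    "mlabonne/ChimeraLlama-3-8B": (39.12, 71.81, 52.4, 42.98),
    "meta-llama/Meta-Llama-3-8B-Instruct": (41.22, 69.86, 51.65, 42.64),
    "mlabonne/OrpoLlama-3-8B": (34.17, 70.59, 52.39, 37.36),
    "meta-llama/Meta-Llama-3-8B": (31.1, 69.95, 43.91, 36.7),
}

# Average column as reported in the table.
REPORTED_AVERAGE = {
    "mlabonne/ChimeraLlama-3-8B": 51.58,
    "meta-llama/Meta-Llama-3-8B-Instruct": 51.34,
    "mlabonne/OrpoLlama-3-8B": 48.63,
    "meta-llama/Meta-Llama-3-8B": 45.42,
}


def mean_score(scores):
    """Unweighted mean of the benchmark scores (assumed definition of Average)."""
    return sum(scores) / len(scores)


def wins(model, baseline):
    """Names of the benchmarks on which `model` scores strictly higher than `baseline`."""
    return [
        name
        for name, ours, theirs in zip(BENCHMARKS, SCORES[model], SCORES[baseline])
        if ours > theirs
    ]


if __name__ == "__main__":
    for model, scores in SCORES.items():
        print(f"{model}: recomputed {mean_score(scores):.2f}, reported {REPORTED_AVERAGE[model]}")
    print(wins("mlabonne/ChimeraLlama-3-8B", "meta-llama/Meta-Llama-3-8B-Instruct"))
```

Compare the recomputed means against the reported values with a small tolerance rather than exact equality, since the table rounds to two decimals.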
 import transformers
 import torch
+model = "mlabonne/ChimeraLlama-3-8B"
 messages = [{"role": "user", "content": "What is a large language model?"}]
 tokenizer = AutoTokenizer.from_pretrained(model)