---
base_model:
- ruslandev/llama-3-8b-samantha
- MathGenie/MathCoder2-Llama-3-8B
- nvidia/OpenMath2-Llama3.1-8B
- NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
- TsinghuaC3I/Llama-3.1-8B-UltraMedical
- tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b
- Skywork/Skywork-o1-Open-Llama-3.1-8B
- Undi95/Llama-3-LewdPlay-8B
- Locutusque/Hercules-6.1-Llama-3.1-8B
- RefuelAI/Llama-3-Refueled
- rombodawg/Llama-3-8B-Instruct-Coder
library_name: transformers
model-index:
- name: Llamaverse-3.1-8B-Instruct
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 61.85
      name: strict accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Llamaverse-3.1-8B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 34.78
      name: normalized accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Llamaverse-3.1-8B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 18.43
      name: exact match
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Llamaverse-3.1-8B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 5.48
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Llamaverse-3.1-8B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 8.42
      name: acc_norm
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Llamaverse-3.1-8B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 28.03
      name: accuracy
    source:
      url: >-
        https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=sethuiyer/Llamaverse-3.1-8B-Instruct
      name: Open LLM Leaderboard
tags:
- mergekit
- merge
license: llama3.1
---
# Llamaverse 8B
![img](llama.webp)
**A Unified Multidisciplinary Language Model**
Llamaverse-3.1-8B-Instruct is a merged language model built on MathCoder2-Llama-3-8B, which was pretrained on MathCode-Pile. That dataset embeds mathematical reasoning steps in both natural language and code, giving the base a strong grounding in logical reasoning. Merging MathCoder2 with 10 specialized models through the Model Stock merge method is intended to produce a generalist that covers mathematics, biomedical diagnostics, storytelling, coding, and more.
This model was merged with the [Model Stock](https://arxiv.org/abs/2403.19522) merge method, using [MathGenie/MathCoder2-Llama-3-8B](https://huggingface.co/MathGenie/MathCoder2-Llama-3-8B) as the base.
This model works well with the [Divine Intellect](https://raw.githubusercontent.com/oobabooga/text-generation-webui/ae8cd449ae3e0236ecb3775892bb1eea23f9ed68/presets/Divine%20Intellect.yaml) preset!
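Since the card lists `transformers` as its library, loading the model could look roughly like the sketch below. The repository id is taken from the leaderboard links in the metadata; the helper names and the default sampling values are assumptions made for illustration and are not the Divine Intellect preset, whose values should be read from the linked file. `device_map="auto"` additionally assumes the `accelerate` package is installed.

```python
# Repository id as it appears in the leaderboard URLs of this card.
MODEL_ID = "sethuiyer/Llamaverse-3.1-8B-Instruct"


def build_generation_kwargs(max_new_tokens=256, temperature=0.7, top_p=0.9):
    """Collect sampling settings (hypothetical helper).

    The defaults are illustrative placeholders, not the Divine Intellect
    preset; substitute the values from the preset file if you want to use it.
    """
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate(prompt, **overrides):
    """Run one text-generation call against the merged model (hypothetical helper)."""
    # Imported lazily so the settings helper works without the heavy dependencies.
    import torch
    from transformers import pipeline

    generator = pipeline(
        "text-generation",
        model=MODEL_ID,
        torch_dtype=torch.bfloat16,  # matches the dtype used in the merge config
        device_map="auto",           # assumes `accelerate` is available
    )
    kwargs = build_generation_kwargs(**overrides)
    return generator(prompt, **kwargs)[0]["generated_text"]
```

Calling `generate("Solve step by step: what is 12 * 17?")` downloads the full 8B checkpoint on first use, so it is best tried on a machine with enough GPU memory or disk space.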
### Models Merged
The following models were included in the merge:
* [ruslandev/llama-3-8b-samantha](https://huggingface.co/ruslandev/llama-3-8b-samantha)
* [nvidia/OpenMath2-Llama3.1-8B](https://huggingface.co/nvidia/OpenMath2-Llama3.1-8B)
* [NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS](https://huggingface.co/NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS)
* [TsinghuaC3I/Llama-3.1-8B-UltraMedical](https://huggingface.co/TsinghuaC3I/Llama-3.1-8B-UltraMedical)
* [tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b](https://huggingface.co/tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b)
* [Skywork/Skywork-o1-Open-Llama-3.1-8B](https://huggingface.co/Skywork/Skywork-o1-Open-Llama-3.1-8B)
* [Undi95/Llama-3-LewdPlay-8B](https://huggingface.co/Undi95/Llama-3-LewdPlay-8B)
* [Locutusque/Hercules-6.1-Llama-3.1-8B](https://huggingface.co/Locutusque/Hercules-6.1-Llama-3.1-8B)
* [RefuelAI/Llama-3-Refueled](https://huggingface.co/RefuelAI/Llama-3-Refueled)
* [rombodawg/Llama-3-8B-Instruct-Coder](https://huggingface.co/rombodawg/Llama-3-8B-Instruct-Coder)
The Model Stock merge is intended to integrate this diverse expertise in a balanced way, without any single model dominating.
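To make the balancing idea concrete, here is a minimal sketch of the interpolation described in the Model Stock paper, applied to toy weight vectors. It is not mergekit's implementation: the real method works per layer on tensors, and estimating the angle as the average pairwise cosine between fine-tuned offsets is a simplifying assumption of this sketch. The function names are hypothetical.

```python
import math
from itertools import combinations


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cosine(a, b):
    # Undefined if either vector has zero norm (a model identical to the base).
    return _dot(a, b) / math.sqrt(_dot(a, a) * _dot(b, b))


def model_stock_merge(base, finetuned):
    """Interpolate between the base weights and the average of N fine-tuned weights.

    t = N*cos(theta) / (1 + (N-1)*cos(theta)), where theta is the angle between
    the fine-tuned models' offsets from the base.
    """
    n = len(finetuned)
    if n < 2:
        raise ValueError("need at least two fine-tuned models")
    # Offsets of each fine-tuned model from the base weights.
    deltas = [[w - b for w, b in zip(ft, base)] for ft in finetuned]
    pairs = list(combinations(deltas, 2))
    cos_theta = sum(_cosine(a, b) for a, b in pairs) / len(pairs)
    t = n * cos_theta / (1 + (n - 1) * cos_theta)
    average = [sum(col) / n for col in zip(*finetuned)]
    merged = [t * a + (1 - t) * b for a, b in zip(average, base)]
    return merged, t
```

When the fine-tuned offsets point in unrelated directions (cosine near 0), `t` shrinks toward 0 and the result stays close to the base; when they agree (cosine near 1), `t` approaches 1 and the result approaches the plain average. Every model enters through the same average, which is the sense in which no single model dominates.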
## **Multidisciplinary Expertise**
Llamaverse-3.1-8B-Instruct integrates the strengths of **10 specialized models**, including:
1. **ruslandev/llama-3-8b-samantha:** Empathetic and human-like interaction capabilities.
2. **nvidia/OpenMath2-Llama3.1-8B:** Advanced mathematical problem-solving.
3. **NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS:** Creative storytelling and roleplay.
4. **TsinghuaC3I/Llama-3.1-8B-UltraMedical:** Clinical-grade biomedical insights.
5. **tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b:** Immersive narrative generation.
6. **Skywork/Skywork-o1-Open-Llama-3.1-8B:** Reflective reasoning and complex problem-solving.
7. **Undi95/Llama-3-LewdPlay-8B:** Unconventional creativity and boundary-pushing dialogue.
8. **Locutusque/Hercules-6.1-Llama-3.1-8B:** Expertise in physics, biology, chemistry, and engineering.
9. **RefuelAI/Llama-3-Refueled:** Robust NLP capabilities for classification and entity extraction.
10. **rombodawg/Llama-3-8B-Instruct-Coder:** Efficient coding and software development.
## **Ethical Considerations**
Llamaverse-3.1-8B-Instruct must be used responsibly. Users should:
- Avoid deploying the model in high-stakes scenarios without human oversight.
- Be mindful of potential biases and ethical concerns, particularly in sensitive applications.
- Use the model’s creative and unconventional capabilities responsibly.
### Configuration
The following YAML configuration was used to produce this model:
```yaml
models:
- model: nvidia/OpenMath2-Llama3.1-8B
- model: Locutusque/Hercules-6.1-Llama-3.1-8B
- model: RefuelAI/Llama-3-Refueled
- model: Undi95/Llama-3-LewdPlay-8B
- model: ruslandev/llama-3-8b-samantha
- model: rombodawg/Llama-3-8B-Instruct-Coder
- model: Skywork/Skywork-o1-Open-Llama-3.1-8B
- model: NeverSleep/Llama-3-Lumimaid-8B-v0.1-OAS
- model: TsinghuaC3I/Llama-3.1-8B-UltraMedical
- model: tohur/natsumura-storytelling-rp-1.0-llama-3.1-8b
merge_method: model_stock
base_model: MathGenie/MathCoder2-Llama-3-8B
dtype: bfloat16
```
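To reproduce the merge, the configuration above can be saved to a file and passed to mergekit's `mergekit-yaml` command. The sketch below only assembles that command; the file name `config.yaml`, the output directory, and the helper name are assumptions chosen for illustration.

```python
import subprocess


def mergekit_command(config_path, output_dir, use_cuda=False):
    """Build the argument list for mergekit's `mergekit-yaml` CLI (hypothetical helper)."""
    cmd = ["mergekit-yaml", config_path, output_dir]
    if use_cuda:
        # Optional flag that asks mergekit to do the tensor math on the GPU.
        cmd.append("--cuda")
    return cmd


def run_merge(config_path="config.yaml", output_dir="./Llamaverse-3.1-8B-Instruct"):
    """Invoke mergekit; requires mergekit to be installed and all source models to be downloadable."""
    subprocess.run(mergekit_command(config_path, output_dir), check=True)
```

Running the merge fetches all eleven source checkpoints, so expect substantial download time and disk usage.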
Built by merging models from MathGenie, ruslandev, nvidia, NeverSleep, TsinghuaC3I, tohur, Skywork, Undi95, Locutusque, RefuelAI, and rombodawg. Released under the Llama 3.1 Community License.