|
--- |
|
license: llama3.1 |
|
model-index: |
|
- name: Llama-Spark |
|
results: |
|
- task: |
|
type: text-generation |
|
name: Text Generation |
|
dataset: |
|
name: IFEval (0-Shot) |
|
type: HuggingFaceH4/ifeval |
|
args: |
|
num_few_shot: 0 |
|
metrics: |
|
- type: inst_level_strict_acc and prompt_level_strict_acc |
|
value: 79.11 |
|
name: strict accuracy |
|
source: |
|
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=arcee-ai/Llama-Spark |
|
name: Open LLM Leaderboard |
|
- task: |
|
type: text-generation |
|
name: Text Generation |
|
dataset: |
|
name: BBH (3-Shot) |
|
type: BBH |
|
args: |
|
num_few_shot: 3 |
|
metrics: |
|
- type: acc_norm |
|
value: 29.77 |
|
name: normalized accuracy |
|
source: |
|
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=arcee-ai/Llama-Spark |
|
name: Open LLM Leaderboard |
|
- task: |
|
type: text-generation |
|
name: Text Generation |
|
dataset: |
|
name: MATH Lvl 5 (4-Shot) |
|
type: hendrycks/competition_math |
|
args: |
|
num_few_shot: 4 |
|
metrics: |
|
- type: exact_match |
|
value: 1.06 |
|
name: exact match |
|
source: |
|
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=arcee-ai/Llama-Spark |
|
name: Open LLM Leaderboard |
|
- task: |
|
type: text-generation |
|
name: Text Generation |
|
dataset: |
|
name: GPQA (0-shot) |
|
type: Idavidrein/gpqa |
|
args: |
|
num_few_shot: 0 |
|
metrics: |
|
- type: acc_norm |
|
value: 6.6 |
|
name: acc_norm |
|
source: |
|
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=arcee-ai/Llama-Spark |
|
name: Open LLM Leaderboard |
|
- task: |
|
type: text-generation |
|
name: Text Generation |
|
dataset: |
|
name: MuSR (0-shot) |
|
type: TAUR-Lab/MuSR |
|
args: |
|
num_few_shot: 0 |
|
metrics: |
|
- type: acc_norm |
|
value: 2.62 |
|
name: acc_norm |
|
source: |
|
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=arcee-ai/Llama-Spark |
|
name: Open LLM Leaderboard |
|
- task: |
|
type: text-generation |
|
name: Text Generation |
|
dataset: |
|
name: MMLU-PRO (5-shot) |
|
type: TIGER-Lab/MMLU-Pro |
|
config: main |
|
split: test |
|
args: |
|
num_few_shot: 5 |
|
metrics: |
|
- type: acc |
|
value: 30.23 |
|
name: accuracy |
|
source: |
|
url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=arcee-ai/Llama-Spark |
|
name: Open LLM Leaderboard |
|
--- |
|
<div align="center"> |
|
<img src="https://i.ibb.co/9hwFrvL/BLMs-Wkx-NQf-W-46-FZDg-ILhg.jpg" alt="Arcee Spark" style="border-radius: 10px; box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19); max-width: 100%; height: auto;"> |
|
</div> |
|
|
|
|
|
Llama-Spark is a powerful conversational AI model developed by Arcee.ai. It's built on the foundation of Llama-3.1-8B and merges the power of our Tome Dataset with Llama-3.1-8B-Instruct, resulting in a remarkable conversationalist that punches well above its 8B parameter weight class. |
|
|
|
## GGUFs available [here](https://huggingface.co/arcee-ai/Llama-Spark-GGUF) |
|
|
|
## Model Description |
|
|
|
Llama-Spark is our commitment to consistently delivering the best-performing conversational AI in the 6-9B parameter range. As new base models become available, we'll continue to update and improve Spark to maintain its leadership position. |
|
|
|
This model is a successor to our original Arcee-Spark, incorporating advancements and learnings from our ongoing research and development. |
|
|
|
## Intended Uses |
|
|
|
Llama-Spark is intended for use in conversational AI applications, such as chatbots, virtual assistants, and dialogue systems. It excels at engaging in natural and informative conversations. |
|
|
|
## Training Information |
|
|
|
Llama-Spark is built upon the Llama-3.1-8B base model, fine-tuned using of the Tome Dataset and merged with Llama-3.1-8B-Instruct. |
|
## Evaluation Results |
|
Please note that these scores are consistantly higher than the OpenLLM leaderboard, and should be compared to their relative performance increase not weighed against the leaderboard. |
|
<div align="center"> |
|
<img src="https://i.ibb.co/pfSGLtB/Screenshot-2024-08-01-at-11-40-42-PM.png" alt="Arcee Spark" style="border-radius: 10px; box-shadow: 0 4px 8px 0 rgba(0, 0, 0, 0.2), 0 6px 20px 0 rgba(0, 0, 0, 0.19); max-width: 100%; height: auto;"> |
|
</div> |
|
|
|
## Acknowledgements |
|
|
|
We extend our deepest gratitude to **PrimeIntellect** for being our compute sponsor for this project. |
|
|
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) |
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_arcee-ai__Llama-Spark) |
|
|
|
| Metric |Value| |
|
|-------------------|----:| |
|
|Avg. |24.90| |
|
|IFEval (0-Shot) |79.11| |
|
|BBH (3-Shot) |29.77| |
|
|MATH Lvl 5 (4-Shot)| 1.06| |
|
|GPQA (0-shot) | 6.60| |
|
|MuSR (0-shot) | 2.62| |
|
|MMLU-PRO (5-shot) |30.23| |
|
|
|
|