|
---
language:
- en
license: cc-by-nc-4.0
model-index:
- name: MN-12B-Lyra-v3
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 44.86
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Sao10K/MN-12B-Lyra-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 25.87
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Sao10K/MN-12B-Lyra-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 7.18
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Sao10K/MN-12B-Lyra-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 3.69
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Sao10K/MN-12B-Lyra-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 9.04
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Sao10K/MN-12B-Lyra-v3
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 24.99
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=Sao10K/MN-12B-Lyra-v3
      name: Open LLM Leaderboard
---
|
|
|
![Lyra](https://huggingface.co/Sao10K/MN-12B-Lyra-v3/resolve/main/Lyra.png) |
|
|
|
|
|
|
|
|
|
### Ungated. Thanks for the patience! |
|
|
|
|
|
--- |
|
|
|
|
|
Mistral-NeMo-12B-Lyra-v3 is built on top of [Lyra-v2a2](https://huggingface.co/Sao10K/MN-12B-Lyra-v2a2), which itself was built upon [Lyra-v2a1](https://huggingface.co/Sao10K/MN-12B-Lyra-v2a1).
|
|
|
# Model Versioning |
|
```
Lyra-v1 [Merge of Custom Roleplay & Instruct Trains, on Different Formats]
  |
  | [Additional SFT on 10% of Previous Data, Mixed]
  v
Lyra-v2a1
  |
  | [Low Rank SFT Step + Tokenizer Diddling]
  v
Lyra-v2a2
  |
  | [RL Step Performed on Multiturn Sets, Magpie-style Responses by Lyra-v2a2 for Rejected Data]
  v
Lyra-v3
```
|
|
|
# This uses a custom ChatML-style prompting format!
|
|
|
\-> **What can go wrong?** |
|
|
|
```
[INST]system
This is the system prompt.[/INST]
[INST]user
Instructions placed here.[/INST]
[INST]assistant
The model's response will be here.[/INST]
```
|
|
|
`Why this? I had accidentally used the wrong configs. The format was meant for an 8B pruned NeMo train; instead, it went to this. Oops.`
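The template above can be assembled programmatically. A minimal sketch, assuming the turn layout shown in the example; `build_lyra_prompt` is my own helper name, not an official API, so verify against the tokenizer's chat template before relying on it:

```python
# Hypothetical helper that formats messages in this card's custom
# [INST]<role>\n<content>[/INST] style, one wrapped block per turn.
def build_lyra_prompt(messages):
    """messages: list of {"role": ..., "content": ...} dicts."""
    parts = [f"[INST]{m['role']}\n{m['content']}[/INST]" for m in messages]
    # Open an assistant turn for the model to complete.
    parts.append("[INST]assistant\n")
    return "\n".join(parts)

prompt = build_lyra_prompt([
    {"role": "system", "content": "This is the system prompt."},
    {"role": "user", "content": "Instructions placed here."},
])
print(prompt)
```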
|
|
|
# Recommended Samplers: |
|
|
|
```
Temperature: 0.7 - 1.2
min_p: 0.1 - 0.2  # Crucial for NeMo
```
|
|
|
# Recommended Stopping Strings: |
|
|
|
```
<|im_end|>
</s>
```
|
|
|
`Blame the messed-up training configs, oops.`
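The recommended samplers and stopping strings above can be expressed as generation kwargs. A hedged sketch, assuming a backend that supports `min_p` (e.g. a local llama.cpp or vLLM server); the exact parameter names depend on your backend:

```python
# Sampler settings from this card, as a kwargs dict. The specific
# values are picks from within the recommended ranges, not mandates.
generation_kwargs = {
    "temperature": 1.0,              # recommended range: 0.7 - 1.2
    "min_p": 0.15,                   # recommended range: 0.1 - 0.2; crucial for NeMo
    "stop": ["<|im_end|>", "</s>"],  # recommended stopping strings
}
# e.g.: client.completions.create(model=..., prompt=..., **generation_kwargs)
print(generation_kwargs)
```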
|
|
|
# Training Metrics: |
|
|
|
\- Trained on 4xH100 SXM for 6 Hours. |
|
<br>\- Trained for 2 Epochs. |
|
<br>\- Effective Global Batch Size: 128. |
|
<br>\- Dataset Used: A custom, cleaned mix of Stheno-v3.4's Dataset, focused mainly on multiturn. |
|
|
|
--- |
|
|
|
# Extras |
|
|
|
Image Source: AI-Generated with FLUX.1 Dev. |
|
|
|
have a nice day. |
|
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) |
|
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Sao10K__MN-12B-Lyra-v3) |
|
|
|
| Metric             |Value|
|--------------------|----:|
| Avg.               |19.27|
| IFEval (0-Shot)    |44.86|
| BBH (3-Shot)       |25.87|
| MATH Lvl 5 (4-Shot)| 7.18|
| GPQA (0-shot)      | 3.69|
| MuSR (0-shot)      | 9.04|
| MMLU-PRO (5-shot)  |24.99|
|
|
|
|