Update README.md
README.md
CHANGED
@@ -141,18 +141,7 @@ name: Qwen2.5-14B-YOYO-latest-V2
|
|
141 |
Although the uncontrollable output issue has been addressed, the model still lacks stability.
|
142 |
|
143 |
Through practical experimentation, I found that first merging **"high-divergence"** models (significantly different from the base) into **"low-divergence"** models (closer to the base) using the [DELLA](https://arxiv.org/abs/2406.11617) method, then applying the [Model Stock](https://arxiv.org/abs/2403.19522) method, ultimately produces a model that is not only more stable but also achieves better performance.
|
144 |
-
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
|
145 |
-
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Replete-AI__Replete-LLM-V2.5-Qwen-14b)
|
146 |
|
147 |
-
| Metric |Value|
|
148 |
-
|-------------------|----:|
|
149 |
-
|Avg. |42.56|
|
150 |
-
|IFEval (0-Shot) |83.98|
|
151 |
-
|BBH (3-Shot) |49.47|
|
152 |
-
|MATH Lvl 5 (4-Shot)|53.55|
|
153 |
-
|GPQA (0-shot) |10.51|
|
154 |
-
|MuSR (0-shot) |11.10|
|
155 |
-
|MMLU-PRO (5-shot) |46.74|
|
156 |
## Key models used:
|
157 |
*1. Low-divergence, high-performance models:*
|
158 |
|
@@ -298,3 +287,16 @@ normalize: true
|
|
298 |
name: Qwen2.5-14B-1M-YOYO-V3
|
299 |
```
|
300 |
I hope this helps!
|
141 |
Although the uncontrollable output issue has been addressed, the model still lacks stability.
|
142 |
|
143 |
Through practical experimentation, I found that first merging **"high-divergence"** models (significantly different from the base) into **"low-divergence"** models (closer to the base) using the [DELLA](https://arxiv.org/abs/2406.11617) method, then applying the [Model Stock](https://arxiv.org/abs/2403.19522) method, ultimately produces a model that is not only more stable but also achieves better performance.
|
144 |
|
145 |
## Key models used:
|
146 |
*1. Low-divergence, high-performance models:*
|
147 |
|
|
|
287 |
name: Qwen2.5-14B-1M-YOYO-V3
|
288 |
```
|
289 |
I hope this helps!
|
290 |
+
|
291 |
+
# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
|
292 |
+
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_Replete-AI__Replete-LLM-V2.5-Qwen-14b)
|
293 |
+
|
294 |
+
| Metric |Value|
|
295 |
+
|-------------------|----:|
|
296 |
+
|Avg. |42.56|
|
297 |
+
|IFEval (0-Shot) |83.98|
|
298 |
+
|BBH (3-Shot) |49.47|
|
299 |
+
|MATH Lvl 5 (4-Shot)|53.55|
|
300 |
+
|GPQA (0-shot) |10.51|
|
301 |
+
|MuSR (0-shot) |11.10|
|
302 |
+
|MMLU-PRO (5-shot) |46.74|
|