Update README.md
README.md
CHANGED
|
@@ -1,3 +1,29 @@
|
|
| 1 |
-
---
|
| 2 |
-
license: llama3
|
| 3 |
-
---
|
| 1 |
+
---
|
| 2 |
+
license: llama3
|
| 3 |
+
---
|
| 4 |
+
# 🔹 Key Highlights:
|
| 5 |
+
|
| 6 |
+
- 14% Fewer Parameters: nyun-llama3-60B comprises approximately 14% fewer parameters than the popular Llama-3-70B.
|
| 7 |
+
- Intact Performance: Despite having fewer parameters, our model performs on par with Llama-3-70B and occasionally outperforms it.
|
| 8 |
+
- No Fine-Tuning Required: This model has undergone no fine-tuning, showcasing the raw potential of our optimization techniques.
|
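As a quick sanity check on the "approximately 14%" figure above, the reduction can be sketched as simple arithmetic. The parameter counts below are nominal values read off the model names (60B and 70B); they are assumptions for illustration, not exact counts from the source.

```python
# Nominal sizes taken from the model names; assumed, not exact parameter counts.
LLAMA3_70B_PARAMS = 70e9
NYUN_60B_PARAMS = 60e9


def percent_fewer(baseline: float, pruned: float) -> float:
    """Return how many percent fewer parameters `pruned` has than `baseline`."""
    return 100.0 * (baseline - pruned) / baseline


reduction = percent_fewer(LLAMA3_70B_PARAMS, NYUN_60B_PARAMS)
print(f"{reduction:.1f}% fewer parameters")  # prints "14.3% fewer parameters"
```

With exact parameter counts the figure would shift slightly, which is consistent with the "approximately" in the highlight.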
| 9 |
+
|
| 10 |
+
## Pipeline and Collaboration
|
| 11 |
+
|
| 12 |
+
For insights into the pipeline and the list of methods used to optimize these models, check out our PruneGPT repository (https://github.com/nyunAI/PruneGPT).
|
| 13 |
+
Companies and organizations interested in working with us to release more open-source variants like this one are invited to reach out at [email protected].
|
| 14 |
+
|
| 15 |
+
### Model Performance
|
| 16 |
+
|
| 17 |
+
| Dataset | Nyun-Llama3-60B | Meta-Llama3-70B | Meta-Llama2-70B | MBZUAI K2-65B |
|
| 18 |
+
| --- | --- | --- | --- | --- |
|
| 19 |
+
| MMLU (5-shot) | 78.6 | 79.5 | 69.7 | 67.9 |
|
| 20 |
+
| Winogrande (5-shot) | 83.4 | 83.1 | 81.8 | 77.0 |
|
| 21 |
+
| BoolQ (0-shot) | 85.2 | 79.0 | 73.1 | 83.0 |
|
| 22 |
+
| Hellaswag (10-shot) | 85.7 | 88.0 | 86.9 | 85.5 |
|
| 23 |
+
| Arc Challenge (25-shot) | 64.4 | 68.8 | 67.2 | 64.8 |
|
| 24 |
+
| GSM8K (5-shot) | 68.7 | 76.9 | 52.6 | 50.2 |
|
| 25 |
+
| Average | 77.7 | 79.2 | 71.9 | 71.4 |
|
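The "Average" row in the table above can be reproduced from the six per-benchmark scores. The sketch below assumes the average is an unweighted mean rounded to one decimal place; the table does not state this explicitly, but the numbers agree with that reading.

```python
# Scores copied from the table above, in the order:
# MMLU, Winogrande, BoolQ, Hellaswag, Arc Challenge, GSM8K.
scores = {
    "Nyun-Llama3-60B": [78.6, 83.4, 85.2, 85.7, 64.4, 68.7],
    "Meta-Llama3-70B": [79.5, 83.1, 79.0, 88.0, 68.8, 76.9],
    "Meta-Llama2-70B": [69.7, 81.8, 73.1, 86.9, 67.2, 52.6],
    "MBZUAI K2-65B": [67.9, 77.0, 83.0, 85.5, 64.8, 50.2],
}

# Assumption: the reported average is a plain unweighted mean, rounded to 0.1.
averages = {
    model: round(sum(values) / len(values), 1)
    for model, values in scores.items()
}

for model, avg in averages.items():
    print(f"{model}: {avg}")
```

Under that assumption the computed means match the reported row (77.7, 79.2, 71.9, 71.4).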
| 26 |
+
|
| 27 |
+
|
| 28 |
+
- **Developed by:** [Nyun AI](https://nyunai.com/)
|
| 29 |
+
- **Repository:** [GitHub](https://github.com/nyunAI/PruneGPT)
|