---
datasets:
- BAAI/Infinity-Instruct
base_model:
- nvidia/Llama-3.1-Minitron-4B-Depth-Base
---
|
|
|
We fine-tune nvidia/Llama-3.1-Minitron-4B-Depth-Base with the LLM-Neo method, which combines LoRA and knowledge distillation (KD) in a single training pipeline. The training data consists of 100k samples drawn from BAAI/Infinity-Instruct.
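To make the method concrete, here is a minimal, generic LoRA-plus-KD training sketch built on `transformers` and `peft`. It is not the exact LLM-Neo recipe: the teacher checkpoint, LoRA hyperparameters, distillation weight `kd_alpha`, temperature, and data pipeline (tokenized Infinity-Instruct batches with `labels`) are illustrative assumptions.

```python
# Generic LoRA + KD sketch -- illustrative only, not the exact LLM-Neo recipe.
# Assumptions: the teacher checkpoint, LoRA hyperparameters, kd_alpha, and
# temperature are placeholders; the dataloader is expected to yield tokenized
# batches with "input_ids", "attention_mask", and "labels".
import torch
import torch.nn.functional as F
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, AutoTokenizer

teacher_id = "meta-llama/Llama-3.1-8B-Instruct"          # assumed teacher
student_id = "nvidia/Llama-3.1-Minitron-4B-Depth-Base"   # base model of this card
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(student_id)
teacher = AutoModelForCausalLM.from_pretrained(teacher_id, torch_dtype=torch.bfloat16).to(device).eval()
student = AutoModelForCausalLM.from_pretrained(student_id, torch_dtype=torch.bfloat16).to(device)

# Only the LoRA adapter weights are trained; the student backbone stays frozen.
lora_cfg = LoraConfig(r=16, lora_alpha=32, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM")
student = get_peft_model(student, lora_cfg)

optimizer = torch.optim.AdamW((p for p in student.parameters() if p.requires_grad), lr=1e-4)
kd_alpha, temperature = 0.5, 1.0

def training_step(batch):
    """One step: cross-entropy on labels plus KL divergence to the teacher."""
    batch = {k: v.to(device) for k, v in batch.items()}
    out = student(**batch)  # out.loss is the usual next-token cross-entropy
    with torch.no_grad():
        teacher_logits = teacher(input_ids=batch["input_ids"],
                                 attention_mask=batch["attention_mask"]).logits
    kd_loss = F.kl_div(
        F.log_softmax(out.logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    loss = (1.0 - kd_alpha) * out.loss + kd_alpha * kd_loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```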
|
|
|
|
|
|
|
## Benchmarks
|
|
|
In this section, we report results for Llama-3.1-Minitron-4B-Depth-Neo-10w on standard automatic benchmarks. All evaluations use the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) library.
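As a rough guide to reproducing the table below with the harness's Python API, a sketch is shown here. The repository ID, task names (`bbh`, `mmlu`, `ceval-valid`, `cmmlu`), batch size, and few-shot counts are assumptions chosen to match the n-shot column, not a verbatim copy of our evaluation setup.

```python
# Sketch: reproducing the evaluations with lm-evaluation-harness (Python API).
# The model path, task names, batch size, and few-shot counts below are
# assumptions matched to the table; adjust them to your environment.
import lm_eval

MODEL_ARGS = "pretrained=Llama-3.1-Minitron-4B-Depth-Neo-10w,dtype=bfloat16"  # hypothetical repo ID

# 3-shot BBH (exact_match), as in the n-shot column.
bbh = lm_eval.simple_evaluate(
    model="hf", model_args=MODEL_ARGS, tasks=["bbh"], num_fewshot=3, batch_size=8,
)

# 0-shot MMLU, C-Eval, and CMMLU (acc / acc_norm).
rest = lm_eval.simple_evaluate(
    model="hf", model_args=MODEL_ARGS, tasks=["mmlu", "ceval-valid", "cmmlu"], num_fewshot=0, batch_size=8,
)

print(bbh["results"])
print(rest["results"])
```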
|
|
|
### Evaluation results
|
|
|
<table>
  <tr><td><strong>Category</strong></td><td><strong>Benchmark</strong></td><td><strong>Version</strong></td><td><strong>n-shot</strong></td><td><strong>Metric</strong></td><td><strong>Value</strong></td><td><strong>Stderr</strong></td></tr>
  <tr><td rowspan="3">BBH</td><td>BBH (General)</td><td>N/A</td><td>3</td><td>exact_match</td><td>0.4729</td><td>± 0.0055</td></tr>
  <tr><td>BBH (Boolean Expressions)</td><td>2</td><td>3</td><td>exact_match</td><td>0.8120</td><td>± 0.0248</td></tr>
  <tr><td>BBH (Date Understanding)</td><td>2</td><td>3</td><td>exact_match</td><td>0.6600</td><td>± 0.0300</td></tr>
  <tr><td rowspan="4">CEVAL</td><td>CEVAL (General)</td><td>N/A</td><td>0</td><td>acc</td><td>0.4413</td><td>± 0.0135</td></tr>
  <tr><td>CEVAL (Accountant)</td><td>1</td><td>0</td><td>acc</td><td>0.3469</td><td>± 0.0687</td></tr>
  <tr><td>CEVAL (Advanced Mathematics)</td><td>1</td><td>0</td><td>acc</td><td>0.4737</td><td>± 0.1177</td></tr>
  <tr><td>CEVAL (Art Studies)</td><td>1</td><td>0</td><td>acc</td><td>0.4545</td><td>± 0.0880</td></tr>
  <tr><td rowspan="3">MMLU</td><td>MMLU (General)</td><td>N/A</td><td>0</td><td>acc</td><td>0.6048</td><td>± 0.0039</td></tr>
  <tr><td>MMLU (Humanities)</td><td>N/A</td><td>0</td><td>acc</td><td>0.5552</td><td>± 0.0067</td></tr>
  <tr><td>MMLU (STEM)</td><td>N/A</td><td>0</td><td>acc</td><td>0.5214</td><td>± 0.0086</td></tr>
  <tr><td rowspan="2">CMMLU</td><td>CMMLU (General)</td><td>N/A</td><td>0</td><td>acc</td><td>0.3548</td><td>± 0.0044</td></tr>
  <tr><td>CMMLU (Normalized)</td><td>N/A</td><td>0</td><td>acc_norm</td><td>0.3548</td><td>± 0.0044</td></tr>
</table>
|
|