	---
	datasets:
	- BAAI/Infinity-Instruct
	base_model:
	- nvidia/Llama-3.1-Minitron-4B-Depth-Base
	---

	We fine-tune nvidia/Llama-3.1-Minitron-4B-Depth-Base with the LLM-Neo method, which combines low-rank adaptation (LoRA) and knowledge distillation (KD) in a single training procedure. The training data consists of 100k examples sampled from BAAI/Infinity-Instruct.
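To make the two ingredients concrete, here is a minimal pure-Python sketch of a LoRA weight update and a standard distillation loss. It is an illustration under stated assumptions, not the training code used for this model: the function names, the mixing weight `alpha`, the `temperature`, and the LoRA `scale` are all hypothetical, and the teacher model and actual hyperparameters are not given in this card.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def lora_effective_weight(w, b, a, scale=1.0):
    """Return W + scale * (B @ A) for nested-list matrices.

    w is d_out x d_in (frozen), b is d_out x r and a is r x d_in (trainable).
    """
    rank = len(a)
    return [
        [
            w[i][j] + scale * sum(b[i][k] * a[k][j] for k in range(rank))
            for j in range(len(w[0]))
        ]
        for i in range(len(w))
    ]


def kd_loss(student_logits, teacher_logits, label, alpha=0.5, temperature=2.0):
    """Mix cross-entropy on the label with KL(teacher || student).

    alpha and temperature are assumed values, chosen only for illustration.
    """
    ce = -math.log(softmax(student_logits)[label])
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    kl = sum(t * math.log(t / s) for t, s in zip(p_teacher, p_student) if t > 0)
    return (1.0 - alpha) * ce + alpha * kl


if __name__ == "__main__":
    w = [[1.0, 0.0], [0.0, 1.0]]
    b = [[1.0], [2.0]]  # d_out x r, with r = 1
    a = [[3.0, 4.0]]    # r x d_in
    print(lora_effective_weight(w, b, a, scale=0.5))
    print(kd_loss([1.0, 2.0, 3.0], [0.5, 2.5, 3.5], label=2))
```

In this sketch only `a` and `b` would receive gradients, so the distillation signal from the teacher is absorbed by the low-rank branch while the base weights stay frozen.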



	## Benchmarks

	In this section, we report the results for Llama-3.1-Minitron-4B-Depth-Neo-10w on standard automatic benchmarks. All evaluations use the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness) library.
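As a rough guide to reproducing such a run, the sketch below builds `lm_eval` command lines without executing them. The task names (`bbh`, `ceval-valid`, `mmlu`, `cmmlu`) and the batch size are assumptions, since this card does not list the exact task configurations; the n-shot values follow the results table, and the repository ID is the one this model is published under.

```python
import shlex

# Repository ID of this model on the Hugging Face Hub.
MODEL_ID = "yang31210999/Llama-3.1-Minitron-4B-Depth-Neo-10w"

# (task name, n-shot). Task names are assumed; n-shot matches the table.
EVAL_RUNS = [("bbh", 3), ("ceval-valid", 0), ("mmlu", 0), ("cmmlu", 0)]


def build_lm_eval_command(task, num_fewshot, model_id=MODEL_ID, batch_size=8):
    """Return the argv list for one lm-evaluation-harness CLI invocation."""
    return [
        "lm_eval",
        "--model", "hf",
        "--model_args", f"pretrained={model_id}",
        "--tasks", task,
        "--num_fewshot", str(num_fewshot),
        "--batch_size", str(batch_size),  # assumed value, tune to your GPU
    ]


if __name__ == "__main__":
    # Print the commands so they can be inspected before running them.
    for task, shots in EVAL_RUNS:
        print(shlex.join(build_lm_eval_command(task, shots)))
```

Scores can shift with the harness version and task variant, so results from such a run may not match the table exactly.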

	### Evaluation results

	<table>
	<tr>
	<td><strong>Category</strong>
	</td>
	<td><strong>Benchmark</strong>
	</td>
	<td><strong>Version</strong>
	</td>
	<td><strong>n-shot</strong>
	</td>
	<td><strong>Metric</strong>
	</td>
	<td><strong>Value</strong>
	</td>
	<td><strong>Stderr</strong>
	</td>
	</tr>
	<tr>
	<td rowspan="3" >BBH
	</td>
	<td>BBH (General)</td>
	<td>N/A</td>
	<td>3</td>
	<td>exact_match</td>
	<td>0.4729</td>
	<td>± 0.0055</td>
	</tr>
	<tr>
	<td>BBH (Boolean Expressions)</td>
	<td>2</td>
	<td>3</td>
	<td>exact_match</td>
	<td>0.8120</td>
	<td>± 0.0248</td>
	</tr>
	<tr>
	<td>BBH (Date Understanding)</td>
	<td>2</td>
	<td>3</td>
	<td>exact_match</td>
	<td>0.6600</td>
	<td>± 0.0300</td>
	</tr>
	<tr>
	<td rowspan="4" >CEVAL
	</td>
	<td>CEVAL (General)</td>
	<td>N/A</td>
	<td>0</td>
	<td>acc</td>
	<td>0.4413</td>
	<td>± 0.0135</td>
	</tr>
	<tr>
	<td>CEVAL (Accountant)</td>
	<td>1</td>
	<td>0</td>
	<td>acc</td>
	<td>0.3469</td>
	<td>± 0.0687</td>
	</tr>
	<tr>
	<td>CEVAL (Advanced Mathematics)</td>
	<td>1</td>
	<td>0</td>
	<td>acc</td>
	<td>0.4737</td>
	<td>± 0.1177</td>
	</tr>
	<tr>
	<td>CEVAL (Art Studies)</td>
	<td>1</td>
	<td>0</td>
	<td>acc</td>
	<td>0.4545</td>
	<td>± 0.0880</td>
	</tr>
	<tr>
	<td rowspan="3" >MMLU
	</td>
	<td>MMLU (General)</td>
	<td>N/A</td>
	<td>0</td>
	<td>acc</td>
	<td>0.6048</td>
	<td>± 0.0039</td>
	</tr>
	<tr>
	<td>MMLU (Humanities)</td>
	<td>N/A</td>
	<td>0</td>
	<td>acc</td>
	<td>0.5552</td>
	<td>± 0.0067</td>
	</tr>
	<tr>
	<td>MMLU (STEM)</td>
	<td>N/A</td>
	<td>0</td>
	<td>acc</td>
	<td>0.5214</td>
	<td>± 0.0086</td>
	</tr>
	<tr>
	<td rowspan="2" >CMMLU
	</td>
	<td>CMMLU (General)</td>
	<td>N/A</td>
	<td>0</td>
	<td>acc</td>
	<td>0.3548</td>
	<td>± 0.0044</td>
	</tr>
	<tr>
	<td>CMMLU (Normalized)</td>
	<td>N/A</td>
	<td>0</td>
	<td>acc_norm</td>
	<td>0.3548</td>
	<td>± 0.0044</td>
	</tr>
	</table>
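The Stderr column matters when comparing rows: some CEVAL subsets have standard errors near 0.1, so their scores are much less certain than the aggregate ones. The sketch below turns a few rows of the table into approximate 95% confidence intervals and a rough implied sample size. It assumes a normal approximation and a simple binomial model of accuracy; the helper names are hypothetical.

```python
Z_95 = 1.96  # two-sided 95% quantile of the standard normal distribution

# (value, stderr) pairs copied from the table above.
RESULTS = {
    "BBH (General)": (0.4729, 0.0055),
    "CEVAL (General)": (0.4413, 0.0135),
    "CEVAL (Advanced Mathematics)": (0.4737, 0.1177),
    "MMLU (General)": (0.6048, 0.0039),
    "CMMLU (General)": (0.3548, 0.0044),
}


def confidence_interval(value, stderr, z=Z_95):
    """Approximate confidence interval, clipped to the valid range [0, 1]."""
    margin = z * stderr
    return max(0.0, value - margin), min(1.0, value + margin)


def implied_sample_size(value, stderr):
    """Rough sample count implied by stderr = sqrt(p * (1 - p) / n)."""
    return value * (1.0 - value) / stderr ** 2


if __name__ == "__main__":
    for name, (value, stderr) in RESULTS.items():
        low, high = confidence_interval(value, stderr)
        n = implied_sample_size(value, stderr)
        print(f"{name}: {value:.4f} in [{low:.4f}, {high:.4f}], n ~ {n:.0f}")
```

Under these assumptions the CEVAL (Advanced Mathematics) interval spans tens of percentage points, which suggests the per-subject scores are best read as indicative rather than as precise rankings.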