klein-zcy/Phi-1_5-MetaMathQA

---
license: apache-2.0
datasets:
- meta-math/MetaMathQA
language:
- en
---

This model is Phi-1.5 fine-tuned with supervised fine-tuning (SFT) on the MetaMathQA dataset. The results are as follows:

| Model                | GSM8K Pass@1 (%) | MATH Pass@1 (%) |
|----------------------|------------------|-----------------|
| MPT-7B               | 6.8              | 3.0             |
| Falcon-7B            | 6.8              | 2.3             |
| LLaMA-1-7B           | 11.0             | 2.9             |
| LLaMA-2-7B           | 14.6             | 2.5             |
| MPT-30B              | 15.2             | 3.1             |
| LLaMA-1-13B          | 17.8             | 3.9             |
| GPT-Neo-2.7B         | 19.5             | --              |
| Falcon-40B           | 19.6             | 2.5             |
| Baichuan-chat-13B    | 23.9             | --              |
| Vicuna-v1.3-13B      | 27.6             | --              |
| LLaMA-2-13B          | 28.7             | 3.9             |
| InternLM-7B          | 31.2             | --              |
| ChatGLM-2-6B         | 32.4             | --              |
| GPT-J-6B             | 34.9             | --              |
| LLaMA-1-33B          | 35.6             | 3.9             |
| LLaMA-2-34B          | 42.2             | 6.24            |
| RFT-7B               | 50.3             | --              |
| LLaMA-1-65B          | 50.9             | 10.6            |
| Qwen-7B              | 51.6             | --              |
| Phi1.5-1.3B          | 54.3             | 15.5            |
| WizardMath-7B        | 54.9             | 10.7            |
| LLaMA-2-70B          | 56.8             | 13.5            |
| WizardMath-13B       | 63.9             | 14.0            |
| MAmmoTH-7B (COT)     | 50.5             | 10.4            |
| MAmmoTH-7B (POT+COT) | 53.6             | 31.5            |
| Arithmo-Mistral-7B   | 74.7             | 25.3            |
| MetaMath-7B          | 66.5             | 19.8            |
| MetaMath-13B         | 72.3             | 22.4            |
| MetaMath-Mistral-7B  | 77.7             | 28.2            |


With only 1.3B parameters, Phi1.5-1.3B scores 54.3 on GSM8K and 15.5 on MATH, close to WizardMath-7B (54.9) and LLaMA-2-70B (56.8) on GSM8K.

You can reproduce these results with the [MetaMath](https://huggingface.co/meta-math/MetaMath-Mistral-7B) evaluation code.
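For readers who want to see what a Pass@1 score measures before running the full evaluation code, here is a minimal sketch of the scoring step. It is not the MetaMath evaluation script. It assumes each completion ends with a line of the form `The answer is: <value>`, the convention used in MetaMathQA-style responses, and the helper names (`extract_answer`, `normalize`, `pass_at_1`) are hypothetical and chosen for illustration only.

```python
from typing import List, Optional

# Assumption: completions end with "The answer is: <value>".
# Change this marker if your model's output format differs.
ANSWER_MARKER = "The answer is:"


def extract_answer(completion: str) -> Optional[str]:
    """Return the text after the last answer marker, or None if it is absent."""
    if ANSWER_MARKER not in completion:
        return None
    tail = completion.rsplit(ANSWER_MARKER, 1)[1]
    lines = tail.strip().splitlines()
    if not lines:
        return None
    # Keep only the first line and drop a trailing sentence period.
    return lines[0].strip().rstrip(".")


def normalize(answer: str) -> str:
    """Canonicalize numeric answers so that "1,000" and "1000.0" compare equal."""
    text = answer.strip().strip("$")
    try:
        return str(float(text.replace(",", "")))
    except ValueError:
        # Non-numeric answers (e.g. LaTeX expressions) are compared as strings.
        return text


def pass_at_1(completions: List[str], references: List[str]) -> float:
    """Percentage of problems whose single sampled completion is correct."""
    if len(completions) != len(references):
        raise ValueError("completions and references must have the same length")
    if not references:
        raise ValueError("at least one problem is required")
    correct = 0
    for completion, reference in zip(completions, references):
        predicted = extract_answer(completion)
        if predicted is not None and normalize(predicted) == normalize(reference):
            correct += 1
    return 100.0 * correct / len(references)
```

A small usage example with made-up completions: two of the four answers match their references, so the score is 50.0.

```python
completions = [
    "16 - 3 - 4 = 9 eggs, 9 * 2 = 18. The answer is: 18",
    "So x = 5. The answer is: 7",
    "The total is 1000 dollars. The answer is: 1,000.",
    "I am not sure.",
]
references = ["18", "5", "1000", "3"]
print(pass_at_1(completions, references))
```

String matching like this is only a rough check for MATH-style answers, where equivalent expressions can be written in several ways; the official evaluation code may apply additional normalization.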