euclaise/gpt-neox-122m-minipile-digits


	---
	license: cc0-1.0
	datasets:
	- JeanKaddour/minipile
	language:
	- en
	library_name: transformers
	---

	A GPT-NeoX model trained on MiniPile, intended as a baseline to compare my MANN models against. It uses [NeelNanda/gpt-neox-tokenizer-digits](https://huggingface.co/NeelNanda/gpt-neox-tokenizer-digits) for tokenization.

	The exact model configuration is as follows:
	```
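	from transformers import AutoTokenizer, GPTNeoXConfig

	# Tokenizer named above; its length sets the vocabulary size.
	tokenizer = AutoTokenizer.from_pretrained("NeelNanda/gpt-neox-tokenizer-digits")

	cfg = GPTNeoXConfig(
	    vocab_size=len(tokenizer),
	    hidden_size=768,
	    intermediate_size=768 * 4,
	    num_hidden_layers=12,
	    num_attention_heads=12,
	    tie_word_embeddings=True,
	    hidden_act="gelu_new",
	    # Not a standard GPTNeoXConfig argument; kept as in the original configuration.
	    tokenizer="NeelNanda/gpt-neox-tokenizer-digits",
	)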
	```
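
To see roughly where the "122m" in the model name comes from, the sketch below estimates the parameter count from the configuration above. It assumes the usual GPT-NeoX layer layout (fused QKV projection, attention output projection, two-layer MLP, two layer norms per block, all with biases, plus a final layer norm). The function name `estimate_gpt_neox_params` and the example vocabulary size are hypothetical; the real vocabulary size is `len(tokenizer)`, which this card does not state.

```python
def estimate_gpt_neox_params(
    vocab_size,
    hidden_size=768,
    intermediate_size=768 * 4,
    num_hidden_layers=12,
    tie_word_embeddings=True,
):
    """Return (non_embedding, embedding) parameter counts.

    Assumes the standard GPT-NeoX block layout; this is an estimate,
    not a count read from the actual checkpoint.
    """
    h, i = hidden_size, intermediate_size

    # Attention: fused QKV projection plus output projection, with biases.
    attention = (h * 3 * h + 3 * h) + (h * h + h)
    # MLP: up-projection and down-projection, with biases.
    mlp = (h * i + i) + (i * h + h)
    # Two layer norms per block, each with a weight and a bias vector.
    layer_norms = 2 * (2 * h)

    per_layer = attention + mlp + layer_norms
    # The final layer norm adds one more weight/bias pair.
    non_embedding = num_hidden_layers * per_layer + 2 * h

    # With tied embeddings the input and output matrices share weights.
    embedding = vocab_size * h
    if not tie_word_embeddings:
        embedding *= 2

    return non_embedding, embedding


# Hypothetical vocabulary size for illustration only.
example_vocab_size = 50_000
non_embedding, embedding = estimate_gpt_neox_params(example_vocab_size)
print(f"non-embedding: {non_embedding:,}")
print(f"embedding:     {embedding:,}")
print(f"total:         {non_embedding + embedding:,}")
```

Under these assumptions the transformer blocks account for about 85M parameters regardless of the tokenizer, and the tied embedding matrix supplies the rest.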
	# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard)
	Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/details_euclaise__gpt-neox-122m-minipile-digits).

	| Metric              | Value |
	|---------------------|-------|
	| Avg.                | 25.1  |
	| ARC (25-shot)       | 20.73 |
	| HellaSwag (10-shot) | 27.03 |
	| MMLU (5-shot)       | 25.31 |
	| TruthfulQA (0-shot) | 49.19 |
	| Winogrande (5-shot) | 52.33 |
	| GSM8K (5-shot)      | 0.0   |
	| DROP (3-shot)       | 1.09  |
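
The "Avg." row appears to be the unweighted mean of the seven benchmark scores in the table above. A minimal sketch that checks this, with the values copied from the table (the variable and function names are illustrative):

```python
# Benchmark scores copied from the table above (the "Avg." row is excluded).
scores = {
    "ARC (25-shot)": 20.73,
    "HellaSwag (10-shot)": 27.03,
    "MMLU (5-shot)": 25.31,
    "TruthfulQA (0-shot)": 49.19,
    "Winogrande (5-shot)": 52.33,
    "GSM8K (5-shot)": 0.0,
    "DROP (3-shot)": 1.09,
}


def mean_score(values):
    """Unweighted arithmetic mean of an iterable of scores."""
    vals = list(values)
    return sum(vals) / len(vals)


average = mean_score(scores.values())
print(f"Mean of {len(scores)} benchmarks: {average:.2f}")
```

Rounded to one decimal place, the result matches the reported 25.1.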