neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w4a16

Text Generation

compressed-tensors


abhinavnmagic committed on Aug 9, 2024

Commit a8c9e50 (verified) · 1 parent: 2abcd4a

Update README.md

Files changed (1)

README.md +4 -8


@@ -124,14 +124,10 @@ model.save_pretrained("Meta-Llama-3.1-405B-Instruct-quantized.w4a16")
 ## Evaluation
-The model was evaluated on the [OpenLLM](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard) leaderboard tasks (version 1) with the [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness/tree/383bbd54bc621086e05aa1b030d8d4d5635b25e6) (commit 383bbd54bc621086e05aa1b030d8d4d5635b25e6) and the [vLLM](https://docs.vllm.ai/en/stable/) engine, using the following command:
-```
-lm_eval \
-  --model vllm \
-  --model_args pretrained="neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w4a16",dtype=auto,gpu_memory_utilization=0.4,add_bos_token=True,max_model_len=4096,tensor_parallel_size=1 \
-  --tasks openllm \
-  --batch_size auto
-```
+The model was evaluated on MMLU, ARC-Challenge, GSM-8K, Hellaswag, Winogrande and TruthfulQA.
+Evaluation was conducted using the Neural Magic fork of [lm-evaluation-harness](https://github.com/neuralmagic/lm-evaluation-harness/tree/llama_3.1_instruct) (branch llama_3.1_instruct) and the [vLLM](https://docs.vllm.ai/en/stable/) engine.
+This version of the lm-evaluation-harness includes versions of ARC-Challenge, GSM-8K, and MMLU that match the prompting style of [Meta-Llama-3.1-Instruct-evals](https://huggingface.co/datasets/meta-llama/Meta-Llama-3.1-8B-Instruct-evals).
 ### Accuracy
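
The added lines name the harness fork and branch but give no invocation, so the removed `lm_eval` command is the only concrete call this diff records. As a minimal sketch, the snippet below rebuilds that removed command as an argument list so that individual settings can be varied. `build_lm_eval_command` is a hypothetical helper, not part of lm-evaluation-harness, and whether the same flags and the `openllm` task group apply to the `llama_3.1_instruct` branch is an assumption this commit does not confirm.

```python
import shlex


def build_lm_eval_command(pretrained, tasks, tensor_parallel_size=1,
                          max_model_len=4096, gpu_memory_utilization=0.4):
    """Hypothetical helper: assemble the lm_eval call shown in the removed lines."""
    # Keys and default values are copied from the removed README command.
    model_args = {
        "pretrained": pretrained,
        "dtype": "auto",
        "gpu_memory_utilization": gpu_memory_utilization,
        "add_bos_token": True,
        "max_model_len": max_model_len,
        "tensor_parallel_size": tensor_parallel_size,
    }
    # lm_eval takes model arguments as one comma-separated key=value string.
    model_args_str = ",".join(f"{key}={value}" for key, value in model_args.items())
    return [
        "lm_eval",
        "--model", "vllm",
        "--model_args", model_args_str,
        "--tasks", tasks,
        "--batch_size", "auto",
    ]


if __name__ == "__main__":
    command = build_lm_eval_command(
        "neuralmagic/Meta-Llama-3.1-405B-Instruct-quantized.w4a16",
        tasks="openllm",  # task group from the removed command
    )
    # Print a shell-quoted form for inspection; nothing is executed here.
    print(shlex.join(command))
```

Returning a list rather than a single string means the result can be passed to `subprocess.run` without shell quoting concerns.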