lilloukas committed
Commit f88dd6a
Parent: 5fdf80b

Update README.md

Files changed (1)
  1. README.md +19 -2
README.md CHANGED
@@ -54,9 +54,26 @@ git clone https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .
```
- Each task (arc_challenge|25-shot, hellaswag|10-shot, hendrycksTest-*|5-shot, truthfulqa_mc|0-shot) was evaluated on a single A100 80GB GPU, using the following:
+ Each task was evaluated on a single A100 80GB GPU.
+
+ ARC
+ ```
+ python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/Platypus-30B --tasks arc_challenge --batch_size 1 --no_cache --write_out --output_path results/Platypus-30B/arc_challenge_25shot.json --device cuda --num_fewshot 25
+ ```
+
+ HellaSwag
+ ```
+ python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/Platypus-30B --tasks hellaswag --batch_size 1 --no_cache --write_out --output_path results/Platypus-30B/hellaswag_10shot.json --device cuda --num_fewshot 10
+ ```
+
+ MMLU
+ ```
+ python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/Platypus-30B --tasks hendrycksTest-* --batch_size 1 --no_cache --write_out --output_path results/Platypus-30B/mmlu_5shot.json --device cuda --num_fewshot 5
+ ```
+
+ TruthfulQA
```
- python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/Platypus-30B --tasks <TASK> --batch_size 1 --no_cache --write_out --output_path results/Platypus-30B/<TASK>.json --device cuda --num_fewshot <NUM_FEWSHOT>
+ python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/Platypus-30B --tasks truthfulqa_mc --batch_size 1 --no_cache --write_out --output_path results/Platypus-30B/truthfulqa_0shot.json --device cuda
```
## Limitations and bias
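The four added commands differ only in task name, few-shot count, and output filename, so they can be driven from a single loop. The script below is an illustrative sketch, not part of this commit: it assumes it is run from the lm-evaluation-harness checkout set up above, and it passes `--num_fewshot 0` for truthfulqa_mc, which should be equivalent to the original command since 0 is the harness default.

```
#!/usr/bin/env bash
# Illustrative wrapper (not part of the commit): run all four leaderboard
# evaluations from the README in sequence. Assumes the current directory is
# the lm-evaluation-harness checkout installed above.
set -euo pipefail

MODEL=lilloukas/Platypus-30B
OUTDIR=results/Platypus-30B
mkdir -p "$OUTDIR"

# Each entry is "task few-shot-count output-stem", copied from the README commands.
for spec in "arc_challenge 25 arc_challenge_25shot" \
            "hellaswag 10 hellaswag_10shot" \
            "hendrycksTest-* 5 mmlu_5shot" \
            "truthfulqa_mc 0 truthfulqa_0shot"; do
  read -r task shots stem <<< "$spec"
  python main.py \
    --model hf-causal-experimental \
    --model_args pretrained="$MODEL" \
    --tasks "$task" \
    --batch_size 1 --no_cache --write_out \
    --output_path "$OUTDIR/${stem}.json" \
    --device cuda \
    --num_fewshot "$shots"  # 0 matches the harness default used for TruthfulQA
done
```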