lilloukas committed on
Commit 0e1a3ca
1 Parent(s): f88dd6a

Update README.md

Files changed (1)
  1. README.md +5 -5
README.md CHANGED
@@ -48,7 +48,7 @@ Dataset of highly filtered and curated question and answer pairs. Release TBD.
 `lilloukas/Platypus-30B` was instruction fine-tuned using LoRA on 4 A100 80GB. For training details and inference instructions please see the [Platypus-30B](https://github.com/arielnlee/Platypus-30B.git) GitHub repo.
 
 ## Reproducing Evaluation Results
-Install LM Evaluation Harness
+Install LM Evaluation Harness:
 ```
 git clone https://github.com/EleutherAI/lm-evaluation-harness
 cd lm-evaluation-harness
@@ -56,22 +56,22 @@ pip install -e .
 ```
 Each task was evaluated on a single A100 80GB GPU.
 
-ARC
+ARC:
 ```
 python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/Platypus-30B --tasks arc_challenge --batch_size 1 --no_cache --write_out --output_path results/Platypus-30B/arc_challenge_25shot.json --device cuda --num_fewshot 25
 ```
 
-HellaSwag
+HellaSwag:
 ```
 python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/Platypus-30B --tasks hellaswag --batch_size 1 --no_cache --write_out --output_path results/Platypus-30B/hellaswag_10shot.json --device cuda --num_fewshot 10
 ```
 
-MMLU
+MMLU:
 ```
 python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/Platypus-30B --tasks hendrycksTest-* --batch_size 1 --no_cache --write_out --output_path results/Platypus-30B/mmlu_5shot.json --device cuda --num_fewshot 5
 ```
 
-TruthfulQA
+TruthfulQA:
 ```
 python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/Platypus-30B --tasks truthfulqa_mc --batch_size 1 --no_cache --write_out --output_path results/Platypus-30B/truthfulqa_0shot.json --device cuda
 ```
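
For convenience, a minimal sketch of collecting the scores from the JSON files that the commands above write via `--output_path`. It assumes the usual pre-0.4 lm-evaluation-harness output layout, where each file carries a top-level `"results"` dict mapping task names to metrics such as `acc` and `acc_norm`; the `results/Platypus-30B` directory simply mirrors the `--output_path` prefix used in the commands.

```python
# Minimal sketch, assuming the pre-0.4 lm-evaluation-harness JSON layout:
# a top-level "results" dict mapping task name -> metric dict (acc, acc_norm, ...).
import json
from pathlib import Path

results_dir = Path("results/Platypus-30B")  # mirrors the --output_path prefix above

for path in sorted(results_dir.glob("*.json")):
    data = json.loads(path.read_text())
    for task, metrics in data.get("results", {}).items():
        # Drop stderr entries so only the point estimates are printed.
        scores = {k: round(v, 4) for k, v in metrics.items() if not k.endswith("_stderr")}
        print(f"{path.stem} / {task}: {scores}")
```

Note that the MMLU run covers every `hendrycksTest-*` subject, so its file holds one entry per subject; the headline 5-shot MMLU number is typically the average of those per-subject accuracies.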