lilloukas committed on
Commit
02008e1
1 Parent(s): 7cf09ba

Create README.md

Files changed (1)
  1. README.md +90 -0
README.md ADDED
@@ -0,0 +1,90 @@
+ ---
+ language:
+ - en
+ tags:
+ - llama
+ license: other
+ metrics:
+ - MMLU
+ - ARC
+ - HellaSwag
+ - TruthfulQA
+ ---
+
+ # Information
+
+ GPlatty-30B is a merge of [lilloukas/Platypus-30B](https://huggingface.co/lilloukas/Platypus-30B) and [chansung/gpt4-alpaca-lora-30b](https://huggingface.co/chansung/gpt4-alpaca-lora-30b).
+
+ | Metric              | Value |
+ |---------------------|-------|
+ | MMLU (5-shot)       | 63.6  |
+ | ARC (25-shot)       | 66.0  |
+ | HellaSwag (10-shot) | 84.8  |
+ | TruthfulQA (0-shot) | 53.8  |
+ | Avg.                | 67.0  |
+
+ We use EleutherAI's [Language Model Evaluation Harness](https://github.com/EleutherAI/lm-evaluation-harness) to run the benchmark tests above.
+
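As a quick sanity check on the metrics table, the sketch below recomputes the `Avg.` row. It assumes the average is the plain, unweighted mean of the four benchmark scores, which the README does not state explicitly; the variable names are illustrative only.

```python
# Scores copied from the metrics table in this README.
scores = {
    "MMLU (5-shot)": 63.6,
    "ARC (25-shot)": 66.0,
    "HellaSwag (10-shot)": 84.8,
    "TruthfulQA (0-shot)": 53.8,
}

# Assumption: "Avg." is the unweighted arithmetic mean of the four scores.
average = sum(scores.values()) / len(scores)

print(f"Average: {average:.2f}")
```

The mean comes out at roughly 67.05, which is consistent with the reported 67.0 up to rounding.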
+ ## Model Details
+
+ * **Trained by:** Cole Hunter & Ariel Lee
+ * **Model type:** **GPlatty-30B** is an auto-regressive language model based on the LLaMA transformer architecture.
+ * **Language(s):** English
+ * **License for base weights:** The base LLaMA model's weights are covered by Meta's [non-commercial bespoke license](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md).
+
+ | Hyperparameter            | Value |
+ |---------------------------|-------|
+ | \\(n_\text{parameters}\\) | 33B   |
+ | \\(d_\text{model}\\)      | 6656  |
+ | \\(n_\text{layers}\\)     | 60    |
+ | \\(n_\text{heads}\\)      | 52    |
+
+
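The hyperparameters in the table can be related to each other with a little arithmetic. The sketch below derives the per-head dimension and a rough parameter count. The `12 * d_model^2` per-layer figure is a common rule of thumb for transformer blocks and is an assumption here, not the exact LLaMA layout; it also ignores embeddings.

```python
# Values copied from the hyperparameter table in this README.
d_model = 6656
n_layers = 60
n_heads = 52

# Per-head dimension: the model width is split evenly across attention heads.
assert d_model % n_heads == 0
head_dim = d_model // n_heads

# Rule-of-thumb estimate (an assumption, not the exact LLaMA layout):
# roughly 12 * d_model^2 weights per transformer layer, ignoring embeddings.
approx_params = 12 * n_layers * d_model**2

print(f"head_dim = {head_dim}")
print(f"approx. parameters = {approx_params / 1e9:.1f}B")
```

Under these assumptions each head has 128 dimensions and the estimate lands a little under 32B, the same order as the 33B listed in the table; the gap is plausibly explained by embeddings and the exact feed-forward width, which this sketch does not model.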
+ ## Reproducing Evaluation Results
+ Install the LM Evaluation Harness:
+ ```
+ git clone https://github.com/EleutherAI/lm-evaluation-harness
+ cd lm-evaluation-harness
+ pip install -e .
+ ```
+ Each task was evaluated on a single A100 80GB GPU.
+
+ ARC
+ ```
+ python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/GPlatty-30B --tasks arc_challenge --batch_size 1 --no_cache --write_out --output_path results/GPlatty-30B/arc_challenge_25shot.json --device cuda --num_fewshot 25
+ ```
+
+ HellaSwag
+ ```
+ python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/GPlatty-30B --tasks hellaswag --batch_size 1 --no_cache --write_out --output_path results/GPlatty-30B/hellaswag_10shot.json --device cuda --num_fewshot 10
+ ```
+
+ MMLU
+ ```
+ python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/GPlatty-30B --tasks "hendrycksTest-*" --batch_size 1 --no_cache --write_out --output_path results/GPlatty-30B/mmlu_5shot.json --device cuda --num_fewshot 5
+ ```
+
+ TruthfulQA
+ ```
+ python main.py --model hf-causal-experimental --model_args pretrained=lilloukas/GPlatty-30B --tasks truthfulqa_mc --batch_size 1 --no_cache --write_out --output_path results/GPlatty-30B/truthfulqa_0shot.json --device cuda
+ ```
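The four evaluation commands differ only in task name, few-shot count, and output file, so they can be generated from a small table. The following is a minimal dry-run sketch that only prints the commands. The helper `build_command` and its `results_dir` parameter are hypothetical names introduced here for illustration; the flags themselves are copied from the commands in this README.

```python
import shlex

MODEL = "lilloukas/GPlatty-30B"

# (task pattern, few-shot count, output file stem), taken from the README commands.
TASKS = [
    ("arc_challenge", 25, "arc_challenge_25shot"),
    ("hellaswag", 10, "hellaswag_10shot"),
    ("hendrycksTest-*", 5, "mmlu_5shot"),
    ("truthfulqa_mc", 0, "truthfulqa_0shot"),
]


def build_command(task, num_fewshot, stem, results_dir="results/GPlatty-30B"):
    """Build one harness invocation as an argument list.

    results_dir is a hypothetical default; use whichever directory you prefer.
    """
    cmd = [
        "python", "main.py",
        "--model", "hf-causal-experimental",
        "--model_args", f"pretrained={MODEL}",
        "--tasks", task,
        "--batch_size", "1",
        "--no_cache",
        "--write_out",
        "--output_path", f"{results_dir}/{stem}.json",
        "--device", "cuda",
    ]
    # Mirror the README: the 0-shot TruthfulQA command omits --num_fewshot.
    if num_fewshot > 0:
        cmd += ["--num_fewshot", str(num_fewshot)]
    return cmd


commands = [build_command(*task) for task in TASKS]

# Dry run: print shell-quoted commands instead of executing them.
for cmd in commands:
    print(shlex.join(cmd))
```

To actually run the evaluations, each argument list could be passed to `subprocess.run(cmd, check=True)` from inside the harness checkout. Building argument lists rather than shell strings also keeps the `hendrycksTest-*` pattern from being expanded by the shell.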
+ ## Limitations and bias
+
+ The base LLaMA model was trained on a variety of data, some of which may contain offensive, harmful, or biased content that can lead to toxic behavior; see Section 5.1 of the LLaMA paper. We have not studied how fine-tuning affects the model's behavior and toxicity. Do not treat chat responses from this model as a substitute for human judgment or as a source of truth. Please use responsibly.
+
+ ## Citations
+
+ ```bibtex
+ @article{touvron2023llama,
+   title={LLaMA: Open and Efficient Foundation Language Models},
+   author={Touvron, Hugo and Lavril, Thibaut and Izacard, Gautier and Martinet, Xavier and Lachaux, Marie-Anne and Lacroix, Timoth{\'e}e and Rozi{\`e}re, Baptiste and Goyal, Naman and Hambro, Eric and Azhar, Faisal and Rodriguez, Aurelien and Joulin, Armand and Grave, Edouard and Lample, Guillaume},
+   journal={arXiv preprint arXiv:2302.13971},
+   year={2023}
+ }
+ @article{hu2021lora,
+   title={LoRA: Low-Rank Adaptation of Large Language Models},
+   author={Hu, Edward J. and Shen, Yelong and Wallis, Phillip and Allen-Zhu, Zeyuan and Li, Yuanzhi and Wang, Shean and Chen, Weizhu},
+   journal={CoRR},
+   year={2021}
+ }
+ ```