Ariel Lee committed
Commit 655c2c5
1 Parent(s): 761ac41

Update README.md

Files changed (1)
  1. README.md +13 -23
README.md CHANGED
@@ -3,34 +3,26 @@ language:
  - en
  tags:
  - llama
- license: apache-2.0
+ license: other
  metrics:
  - MMLU
  - ARC
  - HellaSwag
  - TruthfulQA
- - ReClor
  ---

- # 🥳 Platypus30B has arrived!
+ # 🥳 Platypus-30B has arrived!
+
+ Platypus-30B is an instruction fine-tuned model based on the LLaMA-30b transformer architecture.

  | Metric | Value |
  |-----------------------|-------|
- | MMLU (5-shot) | 64.2 |
- | ARC (25-shot) | 76.7 |
+ | MMLU (5-shot) | 65.4 |
+ | ARC (25-shot) | 64.6 |
  | HellaSwag (10-shot) | 84.3 |
- | TruthfulQA (0-shot) | 37.4 |
- | ReClor (0-shot) | 70 |
-
- ## Model Description
-
- Platypus30B is an instruction fine-tuned LlaMa model.
-
- ## Apply Delta Weights
-
- ```sh
- ADD
- ```
+ | TruthfulQA (0-shot) | 45.8 |
+ |-----------------------|-------|
+ | Avg. | 65 | 💥

  ## Usage

@@ -41,7 +33,7 @@ ADD
  ## Model Details

  * **Trained by**: [Ariel Lee & Cole Hunter, LINK TO WEBSITES]
- * **Model type:** **Platypus30B** is an auto-regressive language model based on the LLaMA transformer architecture.
+ * **Model type:** **Platypus-30B** is an auto-regressive language model based on the LLaMA transformer architecture.
  * **Language(s)**: English
  * **License for base weights**: License for the base LLaMA model's weights is Meta's [non-commercial bespoke license](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md).

@@ -52,15 +44,13 @@ ADD
  | \\(n_\text{layers}\\) | 60 |
  | \\(n_\text{heads}\\) | 52 |

- ## Training
-
- ### Training Dataset
+ ## Training Dataset

  Dataset of highly filtered and curated question and answer pairs. Release TBD.

- ### Training Procedure
+ ## Training Procedure

- `lilloukas/Platypus30b` was instruction fine-tuned using lora [CITE REPO] on 2 A100 80GB with the following configuration:
+ `lilloukas/Platypus-30b` was instruction fine-tuned using lora [CITE REPO] on 4 A100 80GB with the following configuration:

  | Hyperparameter | Value |
  |---------------------|-------|
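For orientation, here is a minimal inference sketch using the Hugging Face transformers library. The repo id `lilloukas/Platypus-30b` is taken from the training-procedure line above; the prompt template and generation settings are illustrative assumptions, not usage documented in this card.

```python
# Minimal sketch: load Platypus-30B with transformers and generate from one prompt.
# The repo id comes from the card above; everything else here is a placeholder.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lilloukas/Platypus-30b"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # roughly 60 GB of weights in fp16 for a 30B model
    device_map="auto",          # shard layers across available GPUs
)

# An Alpaca-style prompt format is assumed here; the card does not specify one.
prompt = "### Instruction:\nSummarize what instruction fine-tuning does.\n\n### Response:\n"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```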
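The hyperparameter table above is truncated in this view, so the snippet below is only a generic sketch of how a LoRA setup is typically declared with the peft library. The base checkpoint name and every numeric value are placeholders, not the configuration the authors used on their 4 A100 80GB setup.

```python
# Generic sketch of attaching LoRA adapters to a causal LM with peft.
# All values below are placeholders, not the card's actual hyperparameters.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-30b")  # placeholder base checkpoint

lora_config = LoraConfig(
    r=16,                                 # placeholder rank
    lora_alpha=32,                        # placeholder scaling factor
    lora_dropout=0.05,                    # placeholder dropout
    target_modules=["q_proj", "v_proj"],  # attention projections commonly targeted in LLaMA models
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights require gradients
```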