lilloukas (Ariel Lee) committed
Commit feb7b4c · 1 parent: 4d109b3

Update README.md (#3)


- Update README.md (c4ec16a65abe020b1e85dd5e2f0618984fe5b36f)


Co-authored-by: Ariel Lee <[email protected]>

Files changed (1)
  1. README.md +4 -25
README.md CHANGED
@@ -13,7 +13,7 @@ metrics:
 
 # 🥳 Platypus-30B has arrived!
 
-Platypus-30B is an instruction fine-tuned model based on the LLaMA-30b transformer architecture.
+Platypus-30B is an instruction fine-tuned model based on the LLaMA-30B transformer architecture and takes advantage of [LoRA](https://arxiv.org/pdf/2106.09685.pdf).
 
 | Metric | Value |
 |-----------------------|-------|
@@ -21,18 +21,11 @@ Platypus-30B is an instruction fine-tuned model based on the LLaMA-30b transform
 | ARC (25-shot) | 64.6 |
 | HellaSwag (10-shot) | 84.3 |
 | TruthfulQA (0-shot) | 45.8 |
-|-----------------------|-------|
-| Avg. | 65 | 💥
-
-## Usage
-
-```sh
-ADD
-```
+| Avg. | 65 |
 
 ## Model Details
 
-* **Trained by**: [Ariel Lee & Cole Hunter, LINK TO WEBSITES]
+* **Trained by**: Cole Hunter & Ariel Lee
 * **Model type:** **Platypus-30B** is an auto-regressive language model based on the LLaMA transformer architecture.
 * **Language(s)**: English
 * **License for base weights**: License for the base LLaMA model's weights is Meta's [non-commercial bespoke license](https://github.com/facebookresearch/llama/blob/main/MODEL_CARD.md).
@@ -50,21 +43,7 @@ Dataset of highly filtered and curated question and answer pairs. Release TBD.
 
 ## Training Procedure
 
-`lilloukas/Platypus-30b` was instruction fine-tuned using lora [CITE REPO] on 4 A100 80GB with the following configuration:
-
-| Hyperparameter | Value |
-|---------------------|-------|
-| learning_rate | --- |
-| batch_size | --- |
-| microbatch_size | --- |
-| warmup_steps | --- |
-| epochs | --- |
-| weight_decay | --- |
-| optimizer | --- |
-| weight_decay | --- |
-| cutoff_len | --- |
-| lora_target_modules | --- |
-
+`lilloukas/Platypus-30B` was instruction fine-tuned using LoRA on 4 A100 80GB GPUs. For training details and inference instructions, please see the [Platypus-30B](https://github.com/arielnlee/Platypus-30B.git) GitHub repo.
 
 ## Limitations and bias
 
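For context on the inference instructions the updated README points to, here is a minimal sketch of loading the published checkpoint with the standard Hugging Face `transformers` causal-LM API. The prompt template and generation settings are illustrative assumptions, not the authors' documented recipe; the linked Platypus-30B GitHub repo has the actual instructions.

```python
# Minimal inference sketch. Assumptions: the checkpoint loads like any other
# LLaMA-style causal LM, and the Alpaca-style prompt below is illustrative only.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "lilloukas/Platypus-30B"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.float16,  # a 30B model in fp16 still needs roughly 60 GB of GPU memory
    device_map="auto",          # shard across available GPUs
)

prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\nExplain LoRA fine-tuning in two sentences.\n\n### Response:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=False)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Likewise, since the commit drops the empty hyperparameter table in favor of a pointer to the repo, the following is only a rough sketch of what a LoRA instruction-tuning setup with the `peft` library typically looks like; the rank, alpha, dropout, target modules, and base-checkpoint id are placeholder assumptions, not the values used for Platypus-30B.

```python
# Illustrative LoRA setup with peft. All hyperparameter values and the base
# checkpoint id below are placeholders, not the Platypus-30B training configuration.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("huggyllama/llama-30b")  # hypothetical base checkpoint

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # a common choice for LLaMA attention layers
    bias="none",
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```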