jinjieyuan committed · b0a26c6 · 1 parent: bfacd96 · Update README.md
Signed-off-by: jinjieyuan <[email protected]>

README.md CHANGED
@@ -5,13 +5,13 @@ license: apache-2.0
 
 # Shears Model Card: shears-llama-7b-50-commonsense-heuristic
 
-The heuristic subnetwork discovered from the super-network fine-tuned on LLaMA-7B with some commonsense reasoning datasets using Shears.
+The heuristic subnetwork discovered from the [super-network](https://huggingface.co/IntelLabs/shears-llama-7b-50-commonsense-super) fine-tuned on LLaMA-7B with some commonsense reasoning datasets using Shears.
 
 ## Model Details
 
 ### Information
 
-- **Model name:**
+- **Model name:** shears-llama-7b-50-commonsense-heuristic
 - **Base model:** [LLaMA-7b](https://huggingface.co/yahma/llama-7b-hf)
 - **Sparsity:** 50%
 - **Domain:** Commonsense
@@ -22,14 +22,14 @@ The heuristic subnetwork discovered from the super-network fine-tuned on LLaMA-7
 
 - **LoRA rank:** 32
 - **LoRA alpha:** 64
-- **LoRA target modules:** q_proj, k_proj, v_proj, up_proj,
-- **LoRA rank search space:** [32, 24, 16] (for each module)
+- **LoRA target modules:** q_proj, k_proj, v_proj, up_proj, down_proj
+- **LoRA rank search space:** [32, 24, 16] (for each LoRA module)
 
 ### Training Hyperparameters
 
 - **Batch size:** 16
 - **Learning rate:** 3e-4
-- **Epoch:**
+- **Epoch:** 5
 
 ### Training Data
 
@@ -38,10 +38,14 @@ Unified commonsense reasoning dataset: [commonsense_170k.json](https://github.co
 ### Evaluation Data
 [BoolQ](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/boolq/test.json), [PIQA](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/piqa/test.json), [SIQA](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/social_i_qa/test.json), [HellaSwag](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/hellaswag/test.json), [WinoGrande](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/winogrande/test.json), [ARC-e](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/ARC-Easy/test.json), [ARC-c](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/ARC-Challenge/test.json), [OBQA](https://github.com/AGI-Edgerunners/LLM-Adapters/blob/main/dataset/openbookqa/test.json).
 
-
-
 ## How to use
 
+Use our modified PEFT library (apply [patch](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/Shears/patches/peft-modifications-for-shears-inference-usage.patch)):
+```bash
+git clone https://github.com/huggingface/peft.git
+pushd peft && git checkout v0.5.0 && git apply --ignore-space-change --ignore-whitespace peft-modifications-for-shears-inference-usage.patch && pip install -e . && popd
+```
+
 ```python
 import torch
 from peft import PeftModel