jinjieyuan committed · Commit 65347b6 · 1 Parent: 4df8481

Add instruction for the sparse base model

Files changed (1): README.md (+29 −3)
README.md CHANGED
@@ -12,12 +12,38 @@ The heuristic adapter discovered from the [super-adapter](https://huggingface.co
 ### Information
 
 - **Model name:** shears-llama-7b-50-cs-heuristic-adapter
-- **Base model:** [IntelLabs/shears-llama-7b-50-base](https://huggingface.co/IntelLabs/shears-llama-7b-50-base)
+- **Base model:** Sparsified [LLaMA-7B](https://huggingface.co/yahma/llama-7b-hf)
 - **Sparsity:** 50%
 - **Domain:** Commonsense
 - **Subnetwork version:** Heuristic
 - **NNCF Configuration:** [nncf_shears_llama.json](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/Shears/nncf_config/nncf_shears_llama.json)
 
+### Sparsified Base Model
+
+Shears employs [Wanda](https://arxiv.org/abs/2306.11695), a simple but effective pruning approach, to sparsify the language model, which then serves as the base model.
+Clone the [Wanda](https://github.com/locuslab/wanda) repo:
+
+```bash
+git clone https://github.com/locuslab/wanda.git && cd wanda && git checkout 8e8fc87 && cd ..
+```
+
+The following command sparsifies LLaMA-7B with Wanda to 50% unstructured sparsity:
+
+```bash
+python wanda/main.py \
+    --model yahma/llama-7b-hf \
+    --prune_method wanda \
+    --sparsity_ratio 0.5 \
+    --sparsity_type unstructured \
+    --save wanda_out \
+    --save_model shears-llama-7b-50-base
+```
+
+- `--model`: The identifier of the model on the Hugging Face model hub, or a local path.
+- `--sparsity_ratio`: The fraction of weights to prune.
+- `--save_model`: The directory where the pruned language model will be stored.
+
+Refer to our [repo](https://github.com/IntelLabs/Hardware-Aware-Automated-Machine-Learning/tree/main/Shears#setup) for the environment information needed to run this command.
+
 ### Adapter Configuration
 
 - **LoRA rank:** 32
@@ -61,14 +87,14 @@ def generate_prompt(instruction):
 ### Response:
 """
 
-base_model = AutoModelForCausalLM.from_pretrained("IntelLabs/shears-llama-7b-50-base")
+base_model = AutoModelForCausalLM.from_pretrained("shears-llama-7b-50-base")
 model = PeftModel.from_pretrained(base_model, "IntelLabs/shears-llama-7b-50-cs-heuristic-adapter")
 model.eval()
 
 non_zero_params = sum([(param.data != 0).sum().item() for _, param in model.named_parameters()])
 print(f"Number of all non-zero parameters: {non_zero_params}")
 
-tokenizer = AutoTokenizer.from_pretrained("IntelLabs/shears-llama-7b-50-base")
+tokenizer = AutoTokenizer.from_pretrained("shears-llama-7b-50-base")
 
 instruction = "Please choose the correct answer to the question: A cactus stem is used to store\n\nAnswer1: fruit "
 "Answer2: liquid Answer3: food Answer4: spines\n\nAnswer format: answer1/answer2/answer3/answer4"