unknown
committed on
Commit
·
0cea2c9
1
Parent(s):
0d418f2
Initial
- README.md +36 -33
- Scripts/Exp/{Accuracy → Acc}/Accurate_Func_Merged.csv +0 -0
- Scripts/Exp/{Accuracy → Acc}/calculate_accuracy.py +3 -3
- Scripts/Exp/{Accuracy → Acc}/calculate_purple.py +1 -1
- Scripts/Exp/{Accuracy → Acc}/wrong_func_list_def.csv +0 -0
- Scripts/Exp/Correction/calculate_correction.py +1 -1
- Scripts/Exp/ForkFlow/calculate_forkflow.py +4 -4
- Scripts/Exp/{Performance → Perf}/LLVM-RI5CY.csv +0 -0
- Scripts/Exp/{Performance → Perf}/LLVM-RISCV.csv +0 -0
- Scripts/Exp/{Performance → Perf}/LLVM-xCORE.csv +0 -0
- Scripts/Exp/{Performance → Perf}/calculate_perf.py +1 -1
- Scripts/Exp/Time/calculate_time.py +1 -1
- Scripts/UnixCoder/run_one_model.py +1 -1
- {saved_models/Fine_Tuned_Model → models/FT_Model}/checkpoint-best-acc/pytorch_model.bin +0 -0
- {saved_models/New_Fine_Tuned_Model → models/New_FT_Model}/.gitkeep +0 -0
- {saved_models → models}/UnixCoder/README.md +0 -0
- {saved_models → models}/UnixCoder/config.json +0 -0
- {saved_models → models}/UnixCoder/gitattributes.txt +0 -0
- {saved_models → models}/UnixCoder/merges.txt +0 -0
- {saved_models → models}/UnixCoder/pytorch_model.bin +0 -0
- {saved_models → models}/UnixCoder/special_tokens_map.json +0 -0
- {saved_models → models}/UnixCoder/tokenizer_config.json +0 -0
- {saved_models → models}/UnixCoder/vocab.json +0 -0
- run_fine_tuning.sh +2 -2
- run_function_test.sh +2 -2
- run_test.sh +2 -2
README.md
CHANGED
@@ -21,9 +21,9 @@ For a detailed description of each script, please refer to the Artifact Appendix
|
|
21 |
```
|
22 |
VEGA_AE
|
23 |
|──dataset
|
24 |
-
|──
|
25 |
-
| |──
|
26 |
-
| |──
|
27 |
| └──UnixCoder
|
28 |
|──Scripts
|
29 |
|──Exp
|
@@ -47,7 +47,7 @@ VEGA_AE
|
|
47 |
|
48 |
## 4. Code Generation
|
49 |
|
50 |
-
We have provided a fine-tuned model using data from ```./dataset/train.jsonl``` and ```./dataset/valid.jsonl``` in ```./
|
51 |
|
52 |
We have also provided a script for a functionality test, which only generates a single function for RI5CY (recorded as PULP in our dataset), taking less than 3 minutes with 8 Nvidia Tesla V100 GPUs.
|
53 |
|
@@ -66,11 +66,11 @@ Upon completion of the code generation, the script outputs
|
|
66 |
$ " Finished Function Inferencing."
|
67 |
```
|
68 |
|
69 |
-
The inference result will be saved in ```./
|
70 |
|
71 |
Check the generated code with:
|
72 |
```
|
73 |
-
$ cat ./
|
74 |
```
|
75 |
|
76 |
In the `result.jsonl` file, the meaning of each item in each entry corresponds as follows:
|
@@ -89,15 +89,18 @@ In the `result.jsonl` file, the meaning of each item in each entry corresponds a
|
|
89 |
|
90 |
- **Run code generation with:**
|
91 |
|
|
|
|
|
|
|
92 |
```
|
93 |
$ bash run_test.sh
|
94 |
```
|
95 |
|
96 |
Customize parameters for inferencing by modifying the following options in ```run_test.sh```.
|
97 |
```
|
98 |
-
--model_name_or_path ../../
|
99 |
--test_filename ../../dataset/test.jsonl \
|
100 |
-
--output_dir ../../
|
101 |
--beam_size 4 \
|
102 |
--train_batch_size 256 \
|
103 |
--eval_batch_size 256 \
|
@@ -120,7 +123,7 @@ Upon completion of the code generation, the script outputs
|
|
120 |
$ " Finished Inferencing."
|
121 |
```
|
122 |
|
123 |
-
The inference result will be saved in ```./
|
124 |
|
125 |
|
126 |
|
@@ -129,7 +132,7 @@ The inference result will be saved in ```./saved_models/Fine_Tuned_Model/result.
|
|
129 |
|
130 |
Fine-tune the original UnixCoder-base-nine with provided ```./dataset/train.jsonl``` and ```./dataset/valid.jsonl```:
|
131 |
|
132 |
-
We provide the original UnixCoder-base-nine in ```./
|
133 |
|
134 |
- **Run fine-tuning with:**
|
135 |
```
|
@@ -138,10 +141,10 @@ $ bash run_fine_tuning.sh
|
|
138 |
|
139 |
Customize parameters for fine-tuning by modifying the following options in ```run_fine_tuning.sh```.
|
140 |
```
|
141 |
-
--model_name_or_path ../../
|
142 |
--train_filename ../../dataset/train.jsonl \
|
143 |
--dev_filename ../../dataset/valid.jsonl \
|
144 |
-
--output_dir ../../
|
145 |
--beam_size 4 \
|
146 |
--train_batch_size 64 \
|
147 |
--eval_batch_size 48 \
|
@@ -160,14 +163,14 @@ We provide the scripts to reproduce each Figure/Table from the paper, along with
|
|
160 |
|
161 |
| Script | Description | Output | Figure/Table |
|
162 |
| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ | -------------- |
|
163 |
-
| ./Scripts/Exp/Time/calculate_time.py | Calculate the time overhead. | ./Scripts/Exp/Time/
|
164 |
-
| ./Scripts/Exp/
|
165 |
-
| ./Scripts/Exp/
|
166 |
-
| ./Scripts/Exp/
|
167 |
-
| ./Scripts/Exp/ForkFlow/calculate_forkflow.py | Calculate the statement-level accuracy of VEGA and ForkFlow. | ./Scripts/Exp/ForkFlow/
|
168 |
-
| ./Scripts/Exp/ForkFlow/calculate_forkflow.py | Calculate the number of accurate statements of VEGA. | ./Scripts/Exp/ForkFlow/
|
169 |
-
| ./Scripts/Exp/Correction/calculate_correction.py | Calculate the time required by two developers to modify the VEGA-generated RISC-V backend. | ./Scripts/Exp/Correction/
|
170 |
-
| ./Scripts/Exp/
|
171 |
### 6.1 Results for Fig. 7
|
172 |
|
173 |
In the code generation process, we set a batch size of 256 on 8 Nvidia Tesla V100 GPUs (each with 16GB memory), meaning each batch contains 256 statements. Since each batch may include statements from different function modules, we did not directly measure the generation time for each function module of the three targets (RISC-V, RI5CY, xCORE) during execution. Instead, we calculated the average inference time of each batch (25 seconds) and then derived the inference time of each statement (25/256 seconds). From the total number of statements within each function module of each target, we then calculated the total inference time required for each function module of each target.
|
@@ -181,7 +184,7 @@ $ python ./Scripts/Exp/Time/calculate_time.py
|
|
181 |
|
182 |
- Results:
|
183 |
```
|
184 |
-
$ cat ./Scripts/Exp/Time/
|
185 |
```
|
186 |
|
187 |
### 6.2 Results for Fig. 8
|
@@ -195,15 +198,15 @@ In this Exact Match evaluation, each statement is deemed correct if the VEGA-gen
|
|
195 |
|
196 |
- Command:
|
197 |
```
|
198 |
-
$ cp ./
|
199 |
-
$ python ./Scripts/Exp/
|
200 |
```
|
201 |
|
202 |
This script will automatically analyze VEGA's output from "result.jsonl" and compare the generated code and confidence scores with the ground truth. Based on this comparison, it will determine whether each function is correct.
|
203 |
|
204 |
- Accuracy Results:
|
205 |
```
|
206 |
-
$ cat ./Scripts/Exp/
|
207 |
```
|
208 |
|
209 |
|
@@ -212,13 +215,13 @@ We also provide a script for calculating the proportion of "Accurate Functions w
|
|
212 |
|
213 |
- Command:
|
214 |
```
|
215 |
-
$ python ./Scripts/Exp/
|
216 |
```
|
217 |
|
218 |
|
219 |
- Results:
|
220 |
```
|
221 |
-
$ cat ./Scripts/Exp/
|
222 |
```
|
223 |
|
224 |
|
@@ -272,13 +275,13 @@ Executing the script in 6.2 will also yield the proportion of the three types of
|
|
272 |
|
273 |
- Command:
|
274 |
```
|
275 |
-
$ python ./Scripts/Exp/
|
276 |
```
|
277 |
|
278 |
|
279 |
- Results:
|
280 |
```
|
281 |
-
$ cat ./Scripts/Exp/
|
282 |
```
|
283 |
|
284 |
|
@@ -294,7 +297,7 @@ $ python ./Scripts/Exp/ForkFlow/calculate_forkflow.py
|
|
294 |
|
295 |
- Results:
|
296 |
```
|
297 |
-
$ cat ./Scripts/Exp/ForkFlow/
|
298 |
```
|
299 |
|
300 |
### 6.5 Results for Table. 3
|
@@ -310,7 +313,7 @@ $ python ./Scripts/Exp/ForkFlow/calculate_forkflow.py
|
|
310 |
|
311 |
- Results:
|
312 |
```
|
313 |
-
$ cat ./Scripts/Exp/ForkFlow/
|
314 |
```
|
315 |
|
316 |
|
@@ -328,7 +331,7 @@ $ python ./Scripts/Exp/Correction/calculate_correction.py
|
|
328 |
|
329 |
- Results:
|
330 |
```
|
331 |
-
$ cat ./Scripts/Exp/Correction/
|
332 |
```
|
333 |
|
334 |
### 6.7 Results for Fig. 10
|
@@ -340,12 +343,12 @@ By executing the following script, the speedup for VEGA-generated LLVM backend (
|
|
340 |
|
341 |
- Command:
|
342 |
```
|
343 |
-
$ python ./Scripts/Exp/
|
344 |
```
|
345 |
|
346 |
- Results:
|
347 |
```
|
348 |
-
$ cat ./Scripts/Exp/
|
349 |
```
|
350 |
|
351 |
|
|
|
21 |
```
|
22 |
VEGA_AE
|
23 |
|──dataset
|
24 |
+
|──models
|
25 |
+
| |──FT_Model
|
26 |
+
| |──New_FT_Model
|
27 |
| └──UnixCoder
|
28 |
|──Scripts
|
29 |
|──Exp
|
|
|
47 |
|
48 |
## 4. Code Generation
|
49 |
|
50 |
+
We have provided a fine-tuned model using data from ```./dataset/train.jsonl``` and ```./dataset/valid.jsonl``` in ```./models/FT_Model```. The ```train.jsonl``` and ```valid.jsonl``` files contain function templates, feature vectors and ground truth for 98 backends in our dataset.
|
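Each line of a ```.jsonl``` file is one standalone JSON record, so the dataset files above can be inspected with a few lines of Python. This is a minimal sketch of the JSON Lines convention only; the field names in the commented example are not taken from the actual dataset schema:

```python
import json

def load_jsonl(path):
    """Read a .jsonl file: one JSON object per non-empty line."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Example: count records and peek at the keys of the first entry.
# records = load_jsonl("./dataset/train.jsonl")
# print(len(records), list(records[0].keys()))
```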
51 |
|
52 |
We have also provided a script for a functionality test, which only generates a single function for RI5CY (recorded as PULP in our dataset), taking less than 3 minutes with 8 Nvidia Tesla V100 GPUs.
|
53 |
|
|
|
66 |
$ " Finished Function Inferencing."
|
67 |
```
|
68 |
|
69 |
+
The inference result will be saved in ```./models/FT_Model/result.jsonl```.
|
70 |
|
71 |
Check the generated code with:
|
72 |
```
|
73 |
+
$ cat ./models/FT_Model/result.jsonl
|
74 |
```
|
75 |
|
76 |
In the `result.jsonl` file, the meaning of each item in each entry corresponds as follows:
|
|
|
89 |
|
90 |
- **Run code generation with:**
|
91 |
|
92 |
+
|
93 |
+
The fine-tuned model will take function templates and feature vectors for RISC-V, RI5CY, and xCORE from ```./dataset/test.jsonl``` as input, generating code and confidence scores automatically.
|
94 |
+
|
95 |
```
|
96 |
$ bash run_test.sh
|
97 |
```
|
98 |
|
99 |
Customize parameters for inferencing by modifying the following options in ```run_test.sh```.
|
100 |
```
|
101 |
+
--model_name_or_path ../../models/UnixCoder \
|
102 |
--test_filename ../../dataset/test.jsonl \
|
103 |
+
--output_dir ../../models/FT_Model \
|
104 |
--beam_size 4 \
|
105 |
--train_batch_size 256 \
|
106 |
--eval_batch_size 256 \
|
|
|
123 |
$ " Finished Inferencing."
|
124 |
```
|
125 |
|
126 |
+
The inference result will be saved in ```./models/FT_Model/result.jsonl```.
|
127 |
|
128 |
|
129 |
|
|
|
132 |
|
133 |
Fine-tune the original UnixCoder-base-nine with provided ```./dataset/train.jsonl``` and ```./dataset/valid.jsonl```:
|
134 |
|
135 |
+
We provide the original UnixCoder-base-nine in ```./models/UnixCoder```; it can also be downloaded from HuggingFace: https://huggingface.co/microsoft/unixcoder-base-nine.
|
136 |
|
137 |
- **Run fine-tuning with:**
|
138 |
```
|
|
|
141 |
|
142 |
Customize parameters for fine-tuning by modifying the following options in ```run_fine_tuning.sh```.
|
143 |
```
|
144 |
+
--model_name_or_path ../../models/UnixCoder \
|
145 |
--train_filename ../../dataset/train.jsonl \
|
146 |
--dev_filename ../../dataset/valid.jsonl \
|
147 |
+
--output_dir ../../models/New_FT_Model \
|
148 |
--beam_size 4 \
|
149 |
--train_batch_size 64 \
|
150 |
--eval_batch_size 48 \
|
|
|
163 |
|
164 |
| Script | Description | Output | Figure/Table |
|
165 |
| ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ | -------------- |
|
166 |
+
| ./Scripts/Exp/Time/calculate_time.py | Calculate the time overhead. | ./Scripts/Exp/Time/Fig7.csv | Fig.7 |
|
167 |
+
| ./Scripts/Exp/Acc/calculate_accuracy.py | Calculate the function-level accuracy. | ./Scripts/Exp/Acc/Fig8_Acc.csv | Fig.8 |
|
168 |
+
| ./Scripts/Exp/Acc/calculate_purple.py | Calculate the percentage of functions accurately synthesized from the statements of various existing targets (Purple Bar in Fig.8). | ./Scripts/Exp/Acc/Fig8_Purple.csv | Fig.8 |
|
169 |
+
| ./Scripts/Exp/Acc/calculate_accuracy.py | Calculate the percentage of three types of error. | ./Scripts/Exp/Acc/Table2.csv | Table.2 |
|
170 |
+
| ./Scripts/Exp/ForkFlow/calculate_forkflow.py | Calculate the statement-level accuracy of VEGA and ForkFlow. | ./Scripts/Exp/ForkFlow/Fig9.csv | Fig.9 |
|
171 |
+
| ./Scripts/Exp/ForkFlow/calculate_forkflow.py | Calculate the number of accurate statements of VEGA. | ./Scripts/Exp/ForkFlow/Table3.csv | Table.3 |
|
172 |
+
| ./Scripts/Exp/Correction/calculate_correction.py | Calculate the time required by two developers to modify the VEGA-generated RISC-V backend. | ./Scripts/Exp/Correction/Table4.csv | Table. 4 |
|
173 |
+
| ./Scripts/Exp/Perf/calculate_perf.py | Calculate the speedup of LLVM-Base (-O3) and LLVM-VEGA (-O3) over LLVM-Base (-O0). | ./Scripts/Exp/Perf/Fig10.csv | Fig. 10 |
|
174 |
### 6.1 Results for Fig. 7
|
175 |
|
176 |
In the code generation process, we set a batch size of 256 on 8 Nvidia Tesla V100 GPUs (each with 16GB memory), meaning each batch contains 256 statements. Since each batch may include statements from different function modules, we did not directly measure the generation time for each function module of the three targets (RISC-V, RI5CY, xCORE) during execution. Instead, we calculated the average inference time of each batch (25 seconds) and then derived the inference time of each statement (25/256 seconds). From the total number of statements within each function module of each target, we then calculated the total inference time required for each function module of each target.
|
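The derivation above reduces to one formula per function module: at an average of 25 seconds per 256-statement batch, a module containing N statements takes roughly ceil(N * 25 / 256) seconds, matching the arithmetic in ```calculate_time.py```. A minimal sketch of that calculation (the module names and statement counts below are made-up placeholders, not numbers from the dataset):

```python
import math

BATCH_TIME_S = 25.0  # measured average inference time per batch (seconds)
BATCH_SIZE = 256     # statements per batch

def module_inference_time(num_statements):
    """Total inference time (seconds) for a module, derived from the
    per-statement rate of BATCH_TIME_S / BATCH_SIZE seconds."""
    return math.ceil(num_statements * BATCH_TIME_S / BATCH_SIZE)

# Hypothetical statement counts, for illustration only.
for module, n in [("ModuleA", 512), ("ModuleB", 1024)]:
    print(module, module_inference_time(n))  # 512 -> 50 s, 1024 -> 100 s
```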
|
|
184 |
|
185 |
- Results:
|
186 |
```
|
187 |
+
$ cat ./Scripts/Exp/Time/Fig7.csv
|
188 |
```
|
189 |
|
190 |
### 6.2 Results for Fig. 8
|
|
|
198 |
|
199 |
- Command:
|
200 |
```
|
201 |
+
$ cp ./models/FT_Model/result.jsonl ./Scripts/Exp/Acc
|
202 |
+
$ python ./Scripts/Exp/Acc/calculate_accuracy.py
|
203 |
```
|
204 |
|
205 |
This script will automatically analyze VEGA's output from "result.jsonl" and compare the generated code and confidence scores with the ground truth. Based on this comparison, it will determine whether each function is correct.
|
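The comparison the script performs can be sketched as a per-function exact-match check: a function counts as correct only if every one of its generated statements matches the ground truth. The record layout below (keys ```func```, ```generated```, ```ground_truth```) is a hypothetical stand-in, not the actual ```result.jsonl``` schema:

```python
from collections import defaultdict

def function_level_accuracy(records):
    """Group statement-level records by function; a function is correct
    only when all of its generated statements exactly match the truth."""
    ok = defaultdict(lambda: True)
    for r in records:  # hypothetical keys: func, generated, ground_truth
        ok[r["func"]] &= (r["generated"].strip() == r["ground_truth"].strip())
    return sum(ok.values()) / len(ok) if ok else 0.0

# Two functions, one fully correct -> accuracy 0.5
sample = [
    {"func": "f1", "generated": "return &RC;", "ground_truth": "return &RC;"},
    {"func": "f2", "generated": "BuildMI(a);", "ground_truth": "BuildMI(b);"},
]
print(function_level_accuracy(sample))
```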
206 |
|
207 |
- Accuracy Results:
|
208 |
```
|
209 |
+
$ cat ./Scripts/Exp/Acc/Fig8_Acc.csv
|
210 |
```
|
211 |
|
212 |
|
|
|
215 |
|
216 |
- Command:
|
217 |
```
|
218 |
+
$ python ./Scripts/Exp/Acc/calculate_purple.py
|
219 |
```
|
220 |
|
221 |
|
222 |
- Results:
|
223 |
```
|
224 |
+
$ cat ./Scripts/Exp/Acc/Fig8_Purple.csv
|
225 |
```
|
226 |
|
227 |
|
|
|
275 |
|
276 |
- Command:
|
277 |
```
|
278 |
+
$ python ./Scripts/Exp/Acc/calculate_accuracy.py
|
279 |
```
|
280 |
|
281 |
|
282 |
- Results:
|
283 |
```
|
284 |
+
$ cat ./Scripts/Exp/Acc/Table2.csv
|
285 |
```
|
286 |
|
287 |
|
|
|
297 |
|
298 |
- Results:
|
299 |
```
|
300 |
+
$ cat ./Scripts/Exp/ForkFlow/Fig9.csv
|
301 |
```
|
302 |
|
303 |
### 6.5 Results for Table. 3
|
|
|
313 |
|
314 |
- Results:
|
315 |
```
|
316 |
+
$ cat ./Scripts/Exp/ForkFlow/Table3.csv
|
317 |
```
|
318 |
|
319 |
|
|
|
331 |
|
332 |
- Results:
|
333 |
```
|
334 |
+
$ cat ./Scripts/Exp/Correction/Table4.csv
|
335 |
```
|
336 |
|
337 |
### 6.7 Results for Fig. 10
|
|
|
343 |
|
344 |
- Command:
|
345 |
```
|
346 |
+
$ python ./Scripts/Exp/Perf/calculate_perf.py
|
347 |
```
|
348 |
|
349 |
- Results:
|
350 |
```
|
351 |
+
$ cat ./Scripts/Exp/Perf/Fig10.csv
|
352 |
```
|
353 |
|
354 |
|
Scripts/Exp/{Accuracy → Acc}/Accurate_Func_Merged.csv
RENAMED
File without changes
|
Scripts/Exp/{Accuracy → Acc}/calculate_accuracy.py
RENAMED
@@ -82,7 +82,7 @@ def calculate_accuracy():
|
|
82 |
|
83 |
all_func_lis = list(set(all_func_lis))
|
84 |
|
85 |
-
with open(folder+"/
|
86 |
f_csv = csv.writer(f)
|
87 |
avg_dic = {}
|
88 |
all_dic = {}
|
@@ -136,7 +136,7 @@ def calculate_accuracy():
|
|
136 |
|
137 |
if __name__ == '__main__':
|
138 |
get_wrong_list()
|
139 |
-
with open(folder+"/
|
140 |
f_csv = csv.writer(f)
|
141 |
f_csv.writerow(["Target", "Module", "Correct", "Total", "Accurate", "Inaccurate", "Confidence Score≥1.00", "Confidence Score in [0.50, 1.00)"])
|
142 |
total_dic = calculate_accuracy()
|
@@ -166,7 +166,7 @@ if __name__ == '__main__':
|
|
166 |
else:
|
167 |
target_func_num_dic[k.split(" ")[0].lower()] += len(list(set(total_dic[k])))
|
168 |
|
169 |
-
with open(folder+"/
|
170 |
f_csv = csv.writer(f)
|
171 |
for k in target_func_num_dic:
|
172 |
#print(target_func_num_dic[k])
|
|
|
82 |
|
83 |
all_func_lis = list(set(all_func_lis))
|
84 |
|
85 |
+
with open(folder+"/Fig8_Acc.csv", 'a', encoding='utf-8', newline="") as f:
|
86 |
f_csv = csv.writer(f)
|
87 |
avg_dic = {}
|
88 |
all_dic = {}
|
|
|
136 |
|
137 |
if __name__ == '__main__':
|
138 |
get_wrong_list()
|
139 |
+
with open(folder+"/Fig8_Acc.csv", 'w', encoding='utf-8', newline="") as f:
|
140 |
f_csv = csv.writer(f)
|
141 |
f_csv.writerow(["Target", "Module", "Correct", "Total", "Accurate", "Inaccurate", "Confidence Score≥1.00", "Confidence Score in [0.50, 1.00)"])
|
142 |
total_dic = calculate_accuracy()
|
|
|
166 |
else:
|
167 |
target_func_num_dic[k.split(" ")[0].lower()] += len(list(set(total_dic[k])))
|
168 |
|
169 |
+
with open(folder+"/Table2.csv", 'w', encoding='utf-8', newline = "") as f:
|
170 |
f_csv = csv.writer(f)
|
171 |
for k in target_func_num_dic:
|
172 |
#print(target_func_num_dic[k])
|
Scripts/Exp/{Accuracy → Acc}/calculate_purple.py
RENAMED
@@ -30,7 +30,7 @@ def calculate_template():
|
|
30 |
res_dic[" ".join([row[-1], row[0]]).lower()] = 1
|
31 |
else:
|
32 |
res_dic[" ".join([row[-1], row[0]]).lower()] += 1
|
33 |
-
with open(folder+"/
|
34 |
f_csv = csv.writer(f)
|
35 |
for k in res_dic.keys():
|
36 |
f_csv.writerow([k.split(' ')[0].replace("pulp", "ri5cy"), k.split(' ')[1], round(float(res_dic[k])/float(len(list(total_dic[k]))), 3)])
|
|
|
30 |
res_dic[" ".join([row[-1], row[0]]).lower()] = 1
|
31 |
else:
|
32 |
res_dic[" ".join([row[-1], row[0]]).lower()] += 1
|
33 |
+
with open(folder+"/Fig8_Purple.csv", 'w', encoding='utf-8', newline="") as f:
|
34 |
f_csv = csv.writer(f)
|
35 |
for k in res_dic.keys():
|
36 |
f_csv.writerow([k.split(' ')[0].replace("pulp", "ri5cy"), k.split(' ')[1], round(float(res_dic[k])/float(len(list(total_dic[k]))), 3)])
|
Scripts/Exp/{Accuracy → Acc}/wrong_func_list_def.csv
RENAMED
File without changes
|
Scripts/Exp/Correction/calculate_correction.py
CHANGED
@@ -26,7 +26,7 @@ result_A = sum_time_by_name("/Dev_A.csv")
|
|
26 |
|
27 |
result_B = sum_time_by_name("/Dev_B.csv")
|
28 |
|
29 |
-
with open(folder+"/
|
30 |
csv_writer = csv.writer(out_file)
|
31 |
for k in result_A.keys():
|
32 |
csv_writer.writerow(["Dev A", k, round(result_A[k]/3600.0, 2)])
|
|
|
26 |
|
27 |
result_B = sum_time_by_name("/Dev_B.csv")
|
28 |
|
29 |
+
with open(folder+"/Table4.csv", mode='w', newline='', encoding='utf-8') as out_file:
|
30 |
csv_writer = csv.writer(out_file)
|
31 |
for k in result_A.keys():
|
32 |
csv_writer.writerow(["Dev A", k, round(result_A[k]/3600.0, 2)])
|
Scripts/Exp/ForkFlow/calculate_forkflow.py
CHANGED
@@ -153,7 +153,7 @@ def duplicate_data(tar):
|
|
153 |
Mod_Result[module][1] += Mips_same
|
154 |
|
155 |
|
156 |
-
with open(folder+"/
|
157 |
f_csv = csv.writer(f)
|
158 |
avg_vega = 0.0
|
159 |
avg_mips = 0.0
|
@@ -163,7 +163,7 @@ def duplicate_data(tar):
|
|
163 |
avg_mips += float(round(kv[1][1]*1.0 / kv[1][0], 3))
|
164 |
f_csv.writerow([tar.replace("PULP", "RI5CY"), "Avg", round(avg_mips / len(Mod_Result), 3), round(avg_vega / len(Mod_Result), 3)])
|
165 |
|
166 |
-
with open(folder+"/
|
167 |
f_csv = csv.writer(f)
|
168 |
all_vega = 0
|
169 |
all_mips = 0
|
@@ -177,11 +177,11 @@ def duplicate_data(tar):
|
|
177 |
|
178 |
if __name__ == '__main__':
|
179 |
get_wrong_list()
|
180 |
-
with open(folder+"/
|
181 |
f_csv = csv.writer(f)
|
182 |
f_csv.writerow(["Target", "Module", "Fork_Acc", "VEGA_Acc"])
|
183 |
|
184 |
-
with open(folder+"/
|
185 |
f_csv = csv.writer(f)
|
186 |
f_csv.writerow(["Target", "Module", "VEGA_Accurate_Lines", "VEGA_Manual_Lines"])
|
187 |
|
|
|
153 |
Mod_Result[module][1] += Mips_same
|
154 |
|
155 |
|
156 |
+
with open(folder+"/Fig9.csv", 'a', encoding='utf-8', newline="") as f:
|
157 |
f_csv = csv.writer(f)
|
158 |
avg_vega = 0.0
|
159 |
avg_mips = 0.0
|
|
|
163 |
avg_mips += float(round(kv[1][1]*1.0 / kv[1][0], 3))
|
164 |
f_csv.writerow([tar.replace("PULP", "RI5CY"), "Avg", round(avg_mips / len(Mod_Result), 3), round(avg_vega / len(Mod_Result), 3)])
|
165 |
|
166 |
+
with open(folder+"/Table3.csv", 'a', encoding='utf-8', newline="") as f:
|
167 |
f_csv = csv.writer(f)
|
168 |
all_vega = 0
|
169 |
all_mips = 0
|
|
|
177 |
|
178 |
if __name__ == '__main__':
|
179 |
get_wrong_list()
|
180 |
+
with open(folder+"/Fig9.csv", 'w', encoding='utf-8', newline="") as f:
|
181 |
f_csv = csv.writer(f)
|
182 |
f_csv.writerow(["Target", "Module", "Fork_Acc", "VEGA_Acc"])
|
183 |
|
184 |
+
with open(folder+"/Table3.csv", 'w', encoding='utf-8', newline="") as f:
|
185 |
f_csv = csv.writer(f)
|
186 |
f_csv.writerow(["Target", "Module", "VEGA_Accurate_Lines", "VEGA_Manual_Lines"])
|
187 |
|
Scripts/Exp/{Performance → Perf}/LLVM-RI5CY.csv
RENAMED
File without changes
|
Scripts/Exp/{Performance → Perf}/LLVM-RISCV.csv
RENAMED
File without changes
|
Scripts/Exp/{Performance → Perf}/LLVM-xCORE.csv
RENAMED
File without changes
|
Scripts/Exp/{Performance → Perf}/calculate_perf.py
RENAMED
@@ -5,7 +5,7 @@ import time
|
|
5 |
|
6 |
folder = str(pathlib.Path(__file__).parent.resolve())
|
7 |
|
8 |
-
with open(folder+"/
|
9 |
csv_writer = csv.writer(out_file)
|
10 |
csv_writer.writerow(["Target", "Case", "LLVM-Base", "LLVM-VEGA"])
|
11 |
|
|
|
5 |
|
6 |
folder = str(pathlib.Path(__file__).parent.resolve())
|
7 |
|
8 |
+
with open(folder+"/Fig10.csv", mode='w', newline='', encoding='utf-8') as out_file:
|
9 |
csv_writer = csv.writer(out_file)
|
10 |
csv_writer.writerow(["Target", "Case", "LLVM-Base", "LLVM-VEGA"])
|
11 |
|
Scripts/Exp/Time/calculate_time.py
CHANGED
@@ -27,7 +27,7 @@ def calculate_time():
|
|
27 |
else:
|
28 |
Target_Module[dic["Target"]+" "+dic["Module"]] += 1
|
29 |
Func_Lis.append(dic["File"]+" "+dic["Func"])
|
30 |
-
with open(folder+"/
|
31 |
writer = csv.writer(f)
|
32 |
for kv in Target_Module.items():
|
33 |
writer.writerow(kv[0].replace("PULP", "RI5CY").split(" ") + [kv[1], math.ceil(kv[1] * 25.0 / 256)])
|
|
|
27 |
else:
|
28 |
Target_Module[dic["Target"]+" "+dic["Module"]] += 1
|
29 |
Func_Lis.append(dic["File"]+" "+dic["Func"])
|
30 |
+
with open(folder+"/Fig7.csv", "w",encoding="utf-8", newline = "") as f:
|
31 |
writer = csv.writer(f)
|
32 |
for kv in Target_Module.items():
|
33 |
writer.writerow(kv[0].replace("PULP", "RI5CY").split(" ") + [kv[1], math.ceil(kv[1] * 25.0 / 256)])
|
Scripts/UnixCoder/run_one_model.py
CHANGED
@@ -372,6 +372,7 @@ def vega_train_main():
|
|
372 |
set_seed(args.seed)
|
373 |
|
374 |
# make dir if output_dir not exist
|
|
|
375 |
if os.path.exists(args.output_dir) is False:
|
376 |
os.makedirs(args.output_dir)
|
377 |
args.model_name_or_path = folder + "/" + args.model_name_or_path
|
@@ -381,7 +382,6 @@ def vega_train_main():
|
|
381 |
args.dev_filename = folder + "/" + args.dev_filename
|
382 |
if args.test_filename:
|
383 |
args.test_filename = folder + "/" + args.test_filename
|
384 |
-
args.output_dir = folder + "/" + args.output_dir
|
385 |
# build model
|
386 |
tokenizer = RobertaTokenizer.from_pretrained(args.model_name_or_path)
|
387 |
config = RobertaConfig.from_pretrained(args.model_name_or_path)
|
|
|
372 |
set_seed(args.seed)
|
373 |
|
374 |
# make dir if output_dir not exist
|
375 |
+
args.output_dir = folder + "/" + args.output_dir
|
376 |
if os.path.exists(args.output_dir) is False:
|
377 |
os.makedirs(args.output_dir)
|
378 |
args.model_name_or_path = folder + "/" + args.model_name_or_path
|
|
|
382 |
args.dev_filename = folder + "/" + args.dev_filename
|
383 |
if args.test_filename:
|
384 |
args.test_filename = folder + "/" + args.test_filename
|
|
|
385 |
# build model
|
386 |
tokenizer = RobertaTokenizer.from_pretrained(args.model_name_or_path)
|
387 |
config = RobertaConfig.from_pretrained(args.model_name_or_path)
|
{saved_models/Fine_Tuned_Model → models/FT_Model}/checkpoint-best-acc/pytorch_model.bin
RENAMED
File without changes
|
{saved_models/New_Fine_Tuned_Model → models/New_FT_Model}/.gitkeep
RENAMED
File without changes
|
{saved_models → models}/UnixCoder/README.md
RENAMED
File without changes
|
{saved_models → models}/UnixCoder/config.json
RENAMED
File without changes
|
{saved_models → models}/UnixCoder/gitattributes.txt
RENAMED
File without changes
|
{saved_models → models}/UnixCoder/merges.txt
RENAMED
File without changes
|
{saved_models → models}/UnixCoder/pytorch_model.bin
RENAMED
File without changes
|
{saved_models → models}/UnixCoder/special_tokens_map.json
RENAMED
File without changes
|
{saved_models → models}/UnixCoder/tokenizer_config.json
RENAMED
File without changes
|
{saved_models → models}/UnixCoder/vocab.json
RENAMED
File without changes
|
run_fine_tuning.sh
CHANGED
@@ -2,10 +2,10 @@
|
|
2 |
python ./Scripts/UnixCoder/run_one_model.py \
|
3 |
--do_train \
|
4 |
--do_eval \
|
5 |
-
--model_name_or_path ../../
|
6 |
--train_filename ../../dataset/train.jsonl \
|
7 |
--dev_filename ../../dataset/valid.jsonl \
|
8 |
-
--output_dir ../../
|
9 |
--beam_size 4 \
|
10 |
--train_batch_size 64 \
|
11 |
--eval_batch_size 48 \
|
|
|
2 |
python ./Scripts/UnixCoder/run_one_model.py \
|
3 |
--do_train \
|
4 |
--do_eval \
|
5 |
+
--model_name_or_path ../../models/UnixCoder \
|
6 |
--train_filename ../../dataset/train.jsonl \
|
7 |
--dev_filename ../../dataset/valid.jsonl \
|
8 |
+
--output_dir ../../models/New_FT_Model \
|
9 |
--beam_size 4 \
|
10 |
--train_batch_size 64 \
|
11 |
--eval_batch_size 48 \
|
run_function_test.sh
CHANGED
@@ -2,9 +2,9 @@
|
|
2 |
# do test
|
3 |
python ./Scripts/UnixCoder/run_one_model.py \
|
4 |
--do_function_test \
|
5 |
-
--model_name_or_path ../../
|
6 |
--test_filename ../../dataset/test.jsonl \
|
7 |
-
--output_dir ../../
|
8 |
--beam_size 4 \
|
9 |
--train_batch_size 256 \
|
10 |
--eval_batch_size 256 \
|
|
|
2 |
# do test
|
3 |
python ./Scripts/UnixCoder/run_one_model.py \
|
4 |
--do_function_test \
|
5 |
+
--model_name_or_path ../../models/UnixCoder \
|
6 |
--test_filename ../../dataset/test.jsonl \
|
7 |
+
--output_dir ../../models/FT_Model \
|
8 |
--beam_size 4 \
|
9 |
--train_batch_size 256 \
|
10 |
--eval_batch_size 256 \
|
run_test.sh
CHANGED
@@ -2,9 +2,9 @@
|
|
2 |
# do test
|
3 |
python ./Scripts/UnixCoder/run_one_model.py \
|
4 |
--do_test \
|
5 |
-
--model_name_or_path ../../
|
6 |
--test_filename ../../dataset/test.jsonl \
|
7 |
-
--output_dir ../../
|
8 |
--beam_size 4 \
|
9 |
--train_batch_size 256 \
|
10 |
--eval_batch_size 256 \
|
|
|
2 |
# do test
|
3 |
python ./Scripts/UnixCoder/run_one_model.py \
|
4 |
--do_test \
|
5 |
+
--model_name_or_path ../../models/UnixCoder \
|
6 |
--test_filename ../../dataset/test.jsonl \
|
7 |
+
--output_dir ../../models/FT_Model \
|
8 |
--beam_size 4 \
|
9 |
--train_batch_size 256 \
|
10 |
--eval_batch_size 256 \
|