unknown committed on
Commit
0cea2c9
·
1 Parent(s): 0d418f2
README.md CHANGED
@@ -21,9 +21,9 @@ For detailed description of each script, please refer to the Artifact Appendix
 ```
 VEGA_AE
 |──dataset
- |──saved_models
- | |──Fine_Tuned_Model
- | |──New_Fine_Tuned_Model
 | └──UnixCoder
 |──Scripts
 |──Exp
@@ -47,7 +47,7 @@ VEGA_AE
 
 ## 4. Code Generation
 
- We have provided a fine-tuned model using data from ```./dataset/train.jsonl``` and ```./dataset/valid.jsonl``` in ```./saved_models/Fine_Tuned_Model```. The ```train.jsonl``` and ```valid.jsonl``` files contain code from 98 backends in our dataset.
 
 We have also provided a script for a functionality test, which generates only a single function for RI5CY (recorded as PULP in our dataset), taking less than 3 minutes on 8 Nvidia Tesla V100 GPUs.
 
@@ -66,11 +66,11 @@ Upon completion of the code generation, the script outputs
 $ " Finished Function Inferencing."
 ```
 
- The inference result will be saved in ```./saved_models/Fine_Tuned_Model/result.jsonl```.
 
 Check the generated code with:
 ```
- $ cat ./saved_models/Fine_Tuned_Model/result.jsonl
 ```
 
 In the `result.jsonl` file, the meaning of each item in each entry corresponds as follows:
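Each line of `result.jsonl` is a standalone JSON object, so it can also be inspected programmatically rather than with `cat`. A minimal Python sketch (the `load_results` helper is illustrative, not part of the artifact):

```python
import json

def load_results(path):
    """Read a .jsonl file (one JSON object per line) into a list of dicts."""
    entries = []
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:  # skip blank lines
                entries.append(json.loads(line))
    return entries

# e.g. inspect the keys of the first generated entry:
# print(list(load_results("./saved_models/Fine_Tuned_Model/result.jsonl")[0].keys()))
```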
@@ -89,15 +89,18 @@ In the `result.jsonl` file, the meaning of each item in each entry corresponds a
 
 - **Run code generation with:**
 
 ```
 $ bash run_test.sh
 ```
 
 Customize parameters for inference by modifying the following options in ```run_test.sh```.
 ```
- --model_name_or_path ../../saved_models/UnixCoder \
 --test_filename ../../dataset/test.jsonl \
- --output_dir ../../saved_models/Fine_Tuned_Model \
 --beam_size 4 \
 --train_batch_size 256 \
 --eval_batch_size 256 \
@@ -120,7 +123,7 @@ Upon completion of the code generation, the script outputs
 $ " Finished Inferencing."
 ```
 
- The inference result will be saved in ```./saved_models/Fine_Tuned_Model/result.jsonl```.
 
 
@@ -129,7 +132,7 @@ The inference result will be saved in ```./saved_models/Fine_Tuned_Model/result.
 
 Fine-tune the original UnixCoder-base-nine with the provided ```./dataset/train.jsonl``` and ```./dataset/valid.jsonl```:
 
- We provide the original UnixCoder-base-nine in ```./saved_models/UnixCoder```. The original UnixCoder-base-nine can also be downloaded from HuggingFace: https://huggingface.co/microsoft/unixcoder-base-nine.
 
 - **Run fine-tuning with:**
 ```
@@ -138,10 +141,10 @@ $ bash run_fine_tuning.sh
 
 Customize parameters for fine-tuning by modifying the following options in ```run_fine_tuning.sh```.
 ```
- --model_name_or_path ../../saved_models/UnixCoder \
 --train_filename ../../dataset/train.jsonl \
 --dev_filename ../../dataset/valid.jsonl \
- --output_dir ../../saved_models/New_Fine_Tuned_Model \
 --beam_size 4 \
 --train_batch_size 64 \
 --eval_batch_size 48 \
@@ -160,14 +163,14 @@ We provide the scripts to reproduce each Figure/Table from the paper, along with
 
 | Script | Description | Output | Figure/Table |
 | ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ | -------------- |
- | ./Scripts/Exp/Time/calculate_time.py | Calculate the time overhead. | ./Scripts/Exp/Time/time_overhead.csv | Fig.7 |
- | ./Scripts/Exp/Accuracy/calculate_accuracy.py | Calculate the function-level accuracy. | ./Scripts/Exp/Accuracy/vega_result.csv | Fig.8 |
- | ./Scripts/Exp/Accuracy/calculate_purple.py | Calculate the percentage of functions accurately synthesized from the statements of various existing targets (Purple Bar in Fig.8). | ./Scripts/Exp/Accuracy/fig8_purple.csv | Fig.8 |
- | ./Scripts/Exp/Accuracy/calculate_accuracy.py | Calculate the percentage of three types of error. | ./Scripts/Exp/Accuracy/err_percentage.csv | Table.2 |
- | ./Scripts/Exp/ForkFlow/calculate_forkflow.py | Calculate the statement-level accuracy of VEGA and ForkFlow. | ./Scripts/Exp/ForkFlow/forkflow_result.csv | Fig.9 |
- | ./Scripts/Exp/ForkFlow/calculate_forkflow.py | Calculate the number of accurate statements of VEGA. | ./Scripts/Exp/ForkFlow/mod_lines.csv | Table.3 |
- | ./Scripts/Exp/Correction/calculate_correction.py | Calculate the time required by two developers to modify the VEGA-generated RISC-V backend. | ./Scripts/Exp/Correction/Correction.csv | Table.4 |
- | ./Scripts/Exp/Performance/calculate_perf.py | Calculate the speedup of LLVM-Base (-O3) and LLVM-VEGA (-O3) over LLVM-Base (-O0). | ./Scripts/Exp/Performance/Perf.csv | Fig.10 |
 
 ### 6.1 Results for Fig. 7
 
 In the code generation process, we set a batch size of 256 on 8 Nvidia Tesla V100 GPUs (each with 16GB memory), meaning each batch contains 256 statements. Since each batch may include statements from different function modules, we did not directly measure the generation time for each function module of the three targets (RISC-V, RI5CY, xCORE) during execution. Instead, we calculated the average inference time of each batch (25 seconds) and then derived the inference time of each statement (25/256 seconds). With the total number of statements within each function module of each target, we subsequently calculated the total inference time required for each function module of each target.
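This per-statement derivation is the same arithmetic that `calculate_time.py` applies per module (it rounds up with `math.ceil`). A minimal sketch, using the constants from the text; the helper name is illustrative:

```python
import math

BATCH_TIME_S = 25.0  # measured average inference time per batch
BATCH_SIZE = 256     # statements per batch

def module_inference_time(num_statements):
    """Total inference time in seconds (rounded up) for a function module
    containing `num_statements` statements."""
    return math.ceil(num_statements * BATCH_TIME_S / BATCH_SIZE)

# e.g. a module with 512 statements takes 512 * 25 / 256 = 50 seconds
```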
@@ -181,7 +184,7 @@ $ python ./Scripts/Exp/Time/calculate_time.py
 
 - Results:
 ```
- $ cat ./Scripts/Exp/Time/time_overhead.csv
 ```
 
 ### 6.2 Results for Fig. 8
@@ -195,15 +198,15 @@ In this Exact Match evaluation, each statement is deemed correct if the VEGA-gen
 
 - Command:
 ```
- $ cp ./saved_models/Fine_Tuned_Model/result.jsonl ./Scripts/Exp/Accuracy
- $ python ./Scripts/Exp/Accuracy/calculate_accuracy.py
 ```
 
 This script will automatically analyze VEGA's output from `result.jsonl` and compare the generated code and confidence scores with the ground truth. Based on this comparison, it determines whether each function is correct.
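The function-level criterion can be sketched as follows. This is a simplified illustration (the record shape and helper name are hypothetical); the real script also weighs confidence scores and accumulates per-module statistics:

```python
def function_is_correct(statements):
    """A function counts as correct only if every generated statement
    exactly matches its ground truth (the Exact Match criterion).
    `statements` is a list of (generated, ground_truth) string pairs,
    a simplified stand-in for the records in result.jsonl."""
    return all(gen.strip() == truth.strip() for gen, truth in statements)

# One mismatched statement makes the whole function inaccurate:
# function_is_correct([("a = 1;", "a = 1;"), ("b = 2;", "b = 3;")]) -> False
```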
 
 - Accuracy Results:
 ```
- $ cat ./Scripts/Exp/Accuracy/vega_result.csv
 ```
 
@@ -212,13 +215,13 @@ We also provide a script for calculating the proportion of "Accurate Functions w
 
 - Command:
 ```
- $ python ./Scripts/Exp/Accuracy/calculate_purple.py
 ```
 
 
 - Results:
 ```
- $ cat ./Scripts/Exp/Accuracy/fig8_purple.csv
 ```
 
224
 
@@ -272,13 +275,13 @@ Executing the script in 6.2 will also yield the proportion of the three types of
 
 - Command:
 ```
- $ python ./Scripts/Exp/Accuracy/calculate_accuracy.py
 ```
 
 
 - Results:
 ```
- $ cat ./Scripts/Exp/Accuracy/err_percentage.csv
 ```
 
@@ -294,7 +297,7 @@ $ python ./Scripts/Exp/ForkFlow/calculate_forkflow.py
 
 - Results:
 ```
- $ cat ./Scripts/Exp/ForkFlow/forkflow_result.csv
 ```
 
 ### 6.5 Results for Table. 3
@@ -310,7 +313,7 @@ $ python ./Scripts/Exp/ForkFlow/calculate_forkflow.py
 
 - Results:
 ```
- $ cat ./Scripts/Exp/ForkFlow/mod_lines.csv
 ```
 
@@ -328,7 +331,7 @@ $ python ./Scripts/Exp/Correction/calculate_correction.py
 
 - Results:
 ```
- $ cat ./Scripts/Exp/Correction/Correction.csv
 ```
 
 ### 6.7 Results for Fig. 10
@@ -340,12 +343,12 @@ By executing the following script, the speedup for VEGA-generated LLVM backend (
 
 - Command:
 ```
- $ python ./Scripts/Exp/Performance/calculate_perf.py
 ```
 
 - Results:
 ```
- $ cat ./Scripts/Exp/Performance/Perf.csv
 ```
 
 
 
 ```
 VEGA_AE
 |──dataset
+ |──models
+ | |──FT_Model
+ | |──New_FT_Model
 | └──UnixCoder
 |──Scripts
 |──Exp
 
 
 ## 4. Code Generation
 
+ We have provided a fine-tuned model using data from ```./dataset/train.jsonl``` and ```./dataset/valid.jsonl``` in ```./models/FT_Model```. The ```train.jsonl``` and ```valid.jsonl``` files contain function templates, feature vectors, and ground truth for 98 backends in our dataset.
 
 We have also provided a script for a functionality test, which generates only a single function for RI5CY (recorded as PULP in our dataset), taking less than 3 minutes on 8 Nvidia Tesla V100 GPUs.
 
 
 $ " Finished Function Inferencing."
 ```
 
+ The inference result will be saved in ```./models/FT_Model/result.jsonl```.
 
 Check the generated code with:
 ```
+ $ cat ./models/FT_Model/result.jsonl
 ```
 
 In the `result.jsonl` file, the meaning of each item in each entry corresponds as follows:
 
 
 - **Run code generation with:**
 
+
+ The fine-tuned model will take function templates and feature vectors for RISC-V, RI5CY, and xCORE from ```./dataset/test.jsonl``` as input, generating code and confidence scores automatically.
+
 ```
 $ bash run_test.sh
 ```
 
 Customize parameters for inference by modifying the following options in ```run_test.sh```.
 ```
+ --model_name_or_path ../../models/UnixCoder \
 --test_filename ../../dataset/test.jsonl \
+ --output_dir ../../models/FT_Model \
 --beam_size 4 \
 --train_batch_size 256 \
 --eval_batch_size 256 \
 
 $ " Finished Inferencing."
 ```
 
+ The inference result will be saved in ```./models/FT_Model/result.jsonl```.
 
 
 
 
 
 Fine-tune the original UnixCoder-base-nine with the provided ```./dataset/train.jsonl``` and ```./dataset/valid.jsonl```:
 
+ We provide the original UnixCoder-base-nine in ```./models/UnixCoder```. The original UnixCoder-base-nine can also be downloaded from HuggingFace: https://huggingface.co/microsoft/unixcoder-base-nine.
 
 - **Run fine-tuning with:**
 ```
 
 
 Customize parameters for fine-tuning by modifying the following options in ```run_fine_tuning.sh```.
 ```
+ --model_name_or_path ../../models/UnixCoder \
 --train_filename ../../dataset/train.jsonl \
 --dev_filename ../../dataset/valid.jsonl \
+ --output_dir ../../models/New_FT_Model \
 --beam_size 4 \
 --train_batch_size 64 \
 --eval_batch_size 48 \
 
 
 | Script | Description | Output | Figure/Table |
 | ---------------------------------------------- | ----------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------ | -------------- |
+ | ./Scripts/Exp/Time/calculate_time.py | Calculate the time overhead. | ./Scripts/Exp/Time/Fig7.csv | Fig.7 |
+ | ./Scripts/Exp/Acc/calculate_accuracy.py | Calculate the function-level accuracy. | ./Scripts/Exp/Acc/Fig8_Acc.csv | Fig.8 |
+ | ./Scripts/Exp/Acc/calculate_purple.py | Calculate the percentage of functions accurately synthesized from the statements of various existing targets (Purple Bar in Fig.8). | ./Scripts/Exp/Acc/Fig8_Purple.csv | Fig.8 |
+ | ./Scripts/Exp/Acc/calculate_accuracy.py | Calculate the percentage of three types of error. | ./Scripts/Exp/Acc/Table2.csv | Table.2 |
+ | ./Scripts/Exp/ForkFlow/calculate_forkflow.py | Calculate the statement-level accuracy of VEGA and ForkFlow. | ./Scripts/Exp/ForkFlow/Fig9.csv | Fig.9 |
+ | ./Scripts/Exp/ForkFlow/calculate_forkflow.py | Calculate the number of accurate statements of VEGA. | ./Scripts/Exp/ForkFlow/Table3.csv | Table.3 |
+ | ./Scripts/Exp/Correction/calculate_correction.py | Calculate the time required by two developers to modify the VEGA-generated RISC-V backend. | ./Scripts/Exp/Correction/Table4.csv | Table.4 |
+ | ./Scripts/Exp/Perf/calculate_perf.py | Calculate the speedup of LLVM-Base (-O3) and LLVM-VEGA (-O3) over LLVM-Base (-O0). | ./Scripts/Exp/Perf/Fig10.csv | Fig.10 |
 
 ### 6.1 Results for Fig. 7
 
 In the code generation process, we set a batch size of 256 on 8 Nvidia Tesla V100 GPUs (each with 16GB memory), meaning each batch contains 256 statements. Since each batch may include statements from different function modules, we did not directly measure the generation time for each function module of the three targets (RISC-V, RI5CY, xCORE) during execution. Instead, we calculated the average inference time of each batch (25 seconds) and then derived the inference time of each statement (25/256 seconds). With the total number of statements within each function module of each target, we subsequently calculated the total inference time required for each function module of each target.
 
 
 - Results:
 ```
+ $ cat ./Scripts/Exp/Time/Fig7.csv
 ```
 
 ### 6.2 Results for Fig. 8
 
 
 - Command:
 ```
+ $ cp ./models/FT_Model/result.jsonl ./Scripts/Exp/Acc
+ $ python ./Scripts/Exp/Acc/calculate_accuracy.py
 ```
 
 This script will automatically analyze VEGA's output from `result.jsonl` and compare the generated code and confidence scores with the ground truth. Based on this comparison, it determines whether each function is correct.
 
 - Accuracy Results:
 ```
+ $ cat ./Scripts/Exp/Acc/Fig8_Acc.csv
 ```
 
 
 
 
 - Command:
 ```
+ $ python ./Scripts/Exp/Acc/calculate_purple.py
 ```
 
 
 - Results:
 ```
+ $ cat ./Scripts/Exp/Acc/Fig8_Purple.csv
 ```
 
 
 
 
 - Command:
 ```
+ $ python ./Scripts/Exp/Acc/calculate_accuracy.py
 ```
 
 
 - Results:
 ```
+ $ cat ./Scripts/Exp/Acc/Table2.csv
 ```
 
 
 
 
 - Results:
 ```
+ $ cat ./Scripts/Exp/ForkFlow/Fig9.csv
 ```
 
 ### 6.5 Results for Table. 3
 
 
 - Results:
 ```
+ $ cat ./Scripts/Exp/ForkFlow/Table3.csv
 ```
 
319
 
 
 
 - Results:
 ```
+ $ cat ./Scripts/Exp/Correction/Table4.csv
 ```
 
 ### 6.7 Results for Fig. 10
 
 
 - Command:
 ```
+ $ python ./Scripts/Exp/Perf/calculate_perf.py
 ```
 
 - Results:
 ```
+ $ cat ./Scripts/Exp/Perf/Fig10.csv
 ```
 
354
 
Scripts/Exp/{Accuracy → Acc}/Accurate_Func_Merged.csv RENAMED
File without changes
Scripts/Exp/{Accuracy → Acc}/calculate_accuracy.py RENAMED
@@ -82,7 +82,7 @@ def calculate_accuracy():
 
 all_func_lis = list(set(all_func_lis))
 
- with open(folder+"/vega_result.csv", 'a', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 avg_dic = {}
 all_dic = {}
@@ -136,7 +136,7 @@ def calculate_accuracy():
 
 if __name__ == '__main__':
 get_wrong_list()
- with open(folder+"/vega_result.csv", 'w', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 f_csv.writerow(["Target", "Module", "Correct", "Total", "Accurate", "Inaccurate", "Confidence Score≈1.00", "Confidence Score in [0.50, 1.00)"])
 total_dic = calculate_accuracy()
@@ -166,7 +166,7 @@ if __name__ == '__main__':
 else:
 target_func_num_dic[k.split(" ")[0].lower()] += len(list(set(total_dic[k])))
 
- with open(folder+"/err_percentage.csv", 'w', encoding='utf-8', newline = "") as f:
 f_csv = csv.writer(f)
 for k in target_func_num_dic:
 #print(target_func_num_dic[k])
 
 
 all_func_lis = list(set(all_func_lis))
 
+ with open(folder+"/Fig8_Acc.csv", 'a', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 avg_dic = {}
 all_dic = {}
 
 if __name__ == '__main__':
 get_wrong_list()
+ with open(folder+"/Fig8_Acc.csv", 'w', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 f_csv.writerow(["Target", "Module", "Correct", "Total", "Accurate", "Inaccurate", "Confidence Score≈1.00", "Confidence Score in [0.50, 1.00)"])
 total_dic = calculate_accuracy()
 
 else:
 target_func_num_dic[k.split(" ")[0].lower()] += len(list(set(total_dic[k])))
 
+ with open(folder+"/Table2.csv", 'w', encoding='utf-8', newline = "") as f:
 f_csv = csv.writer(f)
 for k in target_func_num_dic:
 #print(target_func_num_dic[k])
Scripts/Exp/{Accuracy → Acc}/calculate_purple.py RENAMED
@@ -30,7 +30,7 @@ def calculate_template():
 res_dic[" ".join([row[-1], row[0]]).lower()] = 1
 else:
 res_dic[" ".join([row[-1], row[0]]).lower()] += 1
- with open(folder+"/fig8_purple.csv", 'w', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 for k in res_dic.keys():
 f_csv.writerow([k.split(' ')[0].replace("pulp", "ri5cy"), k.split(' ')[1], round(float(res_dic[k])/float(len(list(total_dic[k]))), 3)])
 
 res_dic[" ".join([row[-1], row[0]]).lower()] = 1
 else:
 res_dic[" ".join([row[-1], row[0]]).lower()] += 1
+ with open(folder+"/Fig8_Purple.csv", 'w', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 for k in res_dic.keys():
 f_csv.writerow([k.split(' ')[0].replace("pulp", "ri5cy"), k.split(' ')[1], round(float(res_dic[k])/float(len(list(total_dic[k]))), 3)])
Scripts/Exp/{Accuracy → Acc}/wrong_func_list_def.csv RENAMED
File without changes
Scripts/Exp/Correction/calculate_correction.py CHANGED
@@ -26,7 +26,7 @@ result_A = sum_time_by_name("/Dev_A.csv")
 
 result_B = sum_time_by_name("/Dev_B.csv")
 
- with open(folder+"/Correction.csv", mode='w', newline='', encoding='utf-8') as out_file:
 csv_writer = csv.writer(out_file)
 for k in result_A.keys():
 csv_writer.writerow(["Dev A", k, round(result_A[k]/3600.0, 2)])
 
 
 result_B = sum_time_by_name("/Dev_B.csv")
 
+ with open(folder+"/Table4.csv", mode='w', newline='', encoding='utf-8') as out_file:
 csv_writer = csv.writer(out_file)
 for k in result_A.keys():
 csv_writer.writerow(["Dev A", k, round(result_A[k]/3600.0, 2)])
Scripts/Exp/ForkFlow/calculate_forkflow.py CHANGED
@@ -153,7 +153,7 @@ def duplicate_data(tar):
 Mod_Result[module][1] += Mips_same
 
 
- with open(folder+"/forkflow_result.csv", 'a', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 avg_vega = 0.0
 avg_mips = 0.0
@@ -163,7 +163,7 @@ def duplicate_data(tar):
 avg_mips += float(round(kv[1][1]*1.0 / kv[1][0], 3))
 f_csv.writerow([tar.replace("PULP", "RI5CY"), "Avg", round(avg_mips / len(Mod_Result), 3), round(avg_vega / len(Mod_Result), 3)])
 
- with open(folder+"/mod_lines.csv", 'a', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 all_vega = 0
 all_mips = 0
@@ -177,11 +177,11 @@ def duplicate_data(tar):
 
 if __name__ == '__main__':
 get_wrong_list()
- with open(folder+"/forkflow_result.csv", 'w', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 f_csv.writerow(["Target", "Module", "Fork_Acc", "VEGA_Acc"])
 
- with open(folder+"/mod_lines.csv", 'w', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 f_csv.writerow(["Target", "Module", "VEGA_Accurate_Lines", "VEGA_Manual_Lines"])
 
 
 Mod_Result[module][1] += Mips_same
 
 
+ with open(folder+"/Fig9.csv", 'a', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 avg_vega = 0.0
 avg_mips = 0.0
 
 avg_mips += float(round(kv[1][1]*1.0 / kv[1][0], 3))
 f_csv.writerow([tar.replace("PULP", "RI5CY"), "Avg", round(avg_mips / len(Mod_Result), 3), round(avg_vega / len(Mod_Result), 3)])
 
+ with open(folder+"/Table3.csv", 'a', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 all_vega = 0
 all_mips = 0
 
 
 if __name__ == '__main__':
 get_wrong_list()
+ with open(folder+"/Fig9.csv", 'w', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 f_csv.writerow(["Target", "Module", "Fork_Acc", "VEGA_Acc"])
 
+ with open(folder+"/Table3.csv", 'w', encoding='utf-8', newline="") as f:
 f_csv = csv.writer(f)
 f_csv.writerow(["Target", "Module", "VEGA_Accurate_Lines", "VEGA_Manual_Lines"])
 
Scripts/Exp/{Performance → Perf}/LLVM-RI5CY.csv RENAMED
File without changes
Scripts/Exp/{Performance → Perf}/LLVM-RISCV.csv RENAMED
File without changes
Scripts/Exp/{Performance → Perf}/LLVM-xCORE.csv RENAMED
File without changes
Scripts/Exp/{Performance → Perf}/calculate_perf.py RENAMED
@@ -5,7 +5,7 @@ import time
 
 folder = str(pathlib.Path(__file__).parent.resolve())
 
- with open(folder+"/Perf.csv", mode='w', newline='', encoding='utf-8') as out_file:
 csv_writer = csv.writer(out_file)
 csv_writer.writerow(["Target", "Case", "LLVM-Base", "LLVM-VEGA"])
 
 
 
 folder = str(pathlib.Path(__file__).parent.resolve())
 
+ with open(folder+"/Fig10.csv", mode='w', newline='', encoding='utf-8') as out_file:
 csv_writer = csv.writer(out_file)
 csv_writer.writerow(["Target", "Case", "LLVM-Base", "LLVM-VEGA"])
 
Scripts/Exp/Time/calculate_time.py CHANGED
@@ -27,7 +27,7 @@ def calculate_time():
 else:
 Target_Module[dic["Target"]+" "+dic["Module"]] += 1
 Func_Lis.append(dic["File"]+" "+dic["Func"])
- with open(folder+"/time_overhead.csv", "w",encoding="utf-8", newline = "") as f:
 writer = csv.writer(f)
 for kv in Target_Module.items():
 writer.writerow(kv[0].replace("PULP", "RI5CY").split(" ") + [kv[1], math.ceil(kv[1] * 25.0 / 256)])
 
 else:
 Target_Module[dic["Target"]+" "+dic["Module"]] += 1
 Func_Lis.append(dic["File"]+" "+dic["Func"])
+ with open(folder+"/Fig7.csv", "w",encoding="utf-8", newline = "") as f:
 writer = csv.writer(f)
 for kv in Target_Module.items():
 writer.writerow(kv[0].replace("PULP", "RI5CY").split(" ") + [kv[1], math.ceil(kv[1] * 25.0 / 256)])
Scripts/UnixCoder/run_one_model.py CHANGED
@@ -372,6 +372,7 @@ def vega_train_main():
 set_seed(args.seed)
 
 # make dir if output_dir not exist
 if os.path.exists(args.output_dir) is False:
 os.makedirs(args.output_dir)
 args.model_name_or_path = folder + "/" + args.model_name_or_path
@@ -381,7 +382,6 @@ def vega_train_main():
 args.dev_filename = folder + "/" + args.dev_filename
 if args.test_filename:
 args.test_filename = folder + "/" + args.test_filename
- args.output_dir = folder + "/" + args.output_dir
 # build model
 tokenizer = RobertaTokenizer.from_pretrained(args.model_name_or_path)
 config = RobertaConfig.from_pretrained(args.model_name_or_path)
 
 set_seed(args.seed)
 
 # make dir if output_dir not exist
+ args.output_dir = folder + "/" + args.output_dir
 if os.path.exists(args.output_dir) is False:
 os.makedirs(args.output_dir)
 args.model_name_or_path = folder + "/" + args.model_name_or_path
 
 args.dev_filename = folder + "/" + args.dev_filename
 if args.test_filename:
 args.test_filename = folder + "/" + args.test_filename
 
 # build model
 tokenizer = RobertaTokenizer.from_pretrained(args.model_name_or_path)
 config = RobertaConfig.from_pretrained(args.model_name_or_path)
{saved_models/Fine_Tuned_Model → models/FT_Model}/checkpoint-best-acc/pytorch_model.bin RENAMED
File without changes
{saved_models/New_Fine_Tuned_Model → models/New_FT_Model}/.gitkeep RENAMED
File without changes
{saved_models → models}/UnixCoder/README.md RENAMED
File without changes
{saved_models → models}/UnixCoder/config.json RENAMED
File without changes
{saved_models → models}/UnixCoder/gitattributes.txt RENAMED
File without changes
{saved_models → models}/UnixCoder/merges.txt RENAMED
File without changes
{saved_models → models}/UnixCoder/pytorch_model.bin RENAMED
File without changes
{saved_models → models}/UnixCoder/special_tokens_map.json RENAMED
File without changes
{saved_models → models}/UnixCoder/tokenizer_config.json RENAMED
File without changes
{saved_models → models}/UnixCoder/vocab.json RENAMED
File without changes
run_fine_tuning.sh CHANGED
@@ -2,10 +2,10 @@
 python ./Scripts/UnixCoder/run_one_model.py \
 --do_train \
 --do_eval \
- --model_name_or_path ../../saved_models/UnixCoder \
 --train_filename ../../dataset/train.jsonl \
 --dev_filename ../../dataset/valid.jsonl \
- --output_dir ../../saved_models/New_Fine_Tuned_Model \
 --beam_size 4 \
 --train_batch_size 64 \
 --eval_batch_size 48 \
 
 python ./Scripts/UnixCoder/run_one_model.py \
 --do_train \
 --do_eval \
+ --model_name_or_path ../../models/UnixCoder \
 --train_filename ../../dataset/train.jsonl \
 --dev_filename ../../dataset/valid.jsonl \
+ --output_dir ../../models/New_FT_Model \
 --beam_size 4 \
 --train_batch_size 64 \
 --eval_batch_size 48 \
run_function_test.sh CHANGED
@@ -2,9 +2,9 @@
 # do test
 python ./Scripts/UnixCoder/run_one_model.py \
 --do_function_test \
- --model_name_or_path ../../saved_models/UnixCoder \
 --test_filename ../../dataset/test.jsonl \
- --output_dir ../../saved_models/Fine_Tuned_Model \
 --beam_size 4 \
 --train_batch_size 256 \
 --eval_batch_size 256 \
 
 # do test
 python ./Scripts/UnixCoder/run_one_model.py \
 --do_function_test \
+ --model_name_or_path ../../models/UnixCoder \
 --test_filename ../../dataset/test.jsonl \
+ --output_dir ../../models/FT_Model \
 --beam_size 4 \
 --train_batch_size 256 \
 --eval_batch_size 256 \
run_test.sh CHANGED
@@ -2,9 +2,9 @@
 # do test
 python ./Scripts/UnixCoder/run_one_model.py \
 --do_test \
- --model_name_or_path ../../saved_models/UnixCoder \
 --test_filename ../../dataset/test.jsonl \
- --output_dir ../../saved_models/Fine_Tuned_Model \
 --beam_size 4 \
 --train_batch_size 256 \
 --eval_batch_size 256 \
 
 # do test
 python ./Scripts/UnixCoder/run_one_model.py \
 --do_test \
+ --model_name_or_path ../../models/UnixCoder \
 --test_filename ../../dataset/test.jsonl \
+ --output_dir ../../models/FT_Model \
 --beam_size 4 \
 --train_batch_size 256 \
 --eval_batch_size 256 \