# ComBack: A Versatile Dataset for Enhancing Compiler Backend Development Efficiency

ComBack is a large-scale multi-platform compiler backend code dataset.
This repository contains all fine-tuned models for experiments with ComBack.

- language: C++/C
- metrics: Exact Match (EM), Edit Distance (ED), BLEU4
- tags: code; compiler backend
- license: CC-BY-4.0
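
The metrics listed above can be computed along the following lines. This is a minimal sketch, assuming whitespace tokenization for BLEU4 and max-length-normalized edit similarity for ED; the scripts used for the reported numbers may differ.

```python
# Hedged metric sketch; normalization and tokenization are assumptions.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction  # pip install nltk

def exact_match(pred: str, ref: str) -> float:
    # EM: 100 if prediction and reference are identical after trimming, else 0.
    return 100.0 if pred.strip() == ref.strip() else 0.0

def levenshtein(a: str, b: str) -> int:
    # Classic dynamic-programming edit distance.
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute
        prev = curr
    return prev[-1]

def edit_similarity(pred: str, ref: str) -> float:
    # ED as reported here: edit distance rescaled to a 0-100 similarity score.
    if not pred and not ref:
        return 100.0
    return 100.0 * (1.0 - levenshtein(pred, ref) / max(len(pred), len(ref)))

def bleu4(pred: str, ref: str) -> float:
    # BLEU4 over whitespace tokens, smoothed so short statements score sanely.
    smooth = SmoothingFunction().method1
    return 100.0 * sentence_bleu([ref.split()], pred.split(),
                                 weights=(0.25, 0.25, 0.25, 0.25),
                                 smoothing_function=smooth)
```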
  
## Task Information

  - Statement-Level Completion: complete the current statement.
  ```c++
  //Inputs:
  ...
  adjustReg(MBB, LastFrameDestroy, DL, SPReg, FPReg, -StackSize + RVFI->getVarArgsSaveSize(),
  //Ground Truth:
  MachineInstr::FrameDestroy);
  ```

  - Next-Statement Suggestion: predict the next statement.

  ```c++
  //Inputs:
  ...
  maxCallFrameSize = (maxCallFrameSize + AlignMask) & ~AlignMask;
  //Ground Truth:
  MFI -> setMaxCallFrameSize(maxCallFrameSize);
  ```


  - Code Generation: generate a function from a function description in natural language.

  ```c++
  //Inputs:
  getPointerRegClass: Returns a TargetRegisterClass used for pointer values.
  Target-Specific Value: Sparc, SP::I64RegsRegClass, SP::IntRegsRegClass.
  //Ground Truth:
  TargetRegisterClass *SparcRegisterInfo::getPointerRegClass(MachineFunction &MF, unsigned Kind) {
      return Subtarget.is64Bit() ? &SP::I64RegsRegClass : &SP::IntRegsRegClass;
  }
  ```
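
As a usage sketch for the tasks above, the snippet below feeds the statement-level completion input to CodeT5+ through Hugging Face `transformers`. Assumptions are flagged: `Salesforce/codet5p-220m` is the public base checkpoint, not one of the fine-tuned models in this repository, so it will not reproduce the ground truth without fine-tuning on ComBack; the fine-tuned checkpoints are assumed to expose the same seq2seq interface.

```python
# Illustrative inference sketch (not the authors' evaluation pipeline).
from transformers import AutoTokenizer, T5ForConditionalGeneration

checkpoint = "Salesforce/codet5p-220m"  # public base model as a stand-in
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = T5ForConditionalGeneration.from_pretrained(checkpoint)

# Truncated statement from the Statement-Level Completion example above.
prompt = ("adjustReg(MBB, LastFrameDestroy, DL, SPReg, FPReg, "
          "-StackSize + RVFI->getVarArgsSaveSize(),")
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```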

## Organization

- `Existing_Targets/*`: **data of all 178 backends, split into train/valid/test sets in the ratio of 80%:10%:10%** (the split convention is sketched after the table below)
  - Dataset Info
  
  | Task | Train | Valid | Test |
  | ---- | ---- | ---- | ---- |
  | Statement-Level Comp. | 128,899 (11.36M tokens) | 16,112 (1.43M tokens) | 16,113 (1.43M tokens) |
  | Next-Statement Sugg. | 173,052 (15.69M tokens) | 21,631 (1.99M tokens) | 21,632 (1.98M tokens) |
  | Code Generation | 36,236 (5.10M tokens) | 4,530 (0.64M tokens) | 4,530 (0.64M tokens) |
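
  The 80%:10%:10% convention can be illustrated with the hypothetical helper below; this is not the authors' splitting script, and the shipped splits in `Existing_Targets/*` should be used as-is for comparability.

  ```python
  # Hypothetical sketch of the 80%:10%:10% split convention described above.
  import random

  def split_80_10_10(examples: list, seed: int = 0):
      # Shuffle a copy so the caller's list is left untouched.
      rng = random.Random(seed)
      shuffled = examples[:]
      rng.shuffle(shuffled)
      n_train = int(len(shuffled) * 0.8)
      n_valid = int(len(shuffled) * 0.1)
      train = shuffled[:n_train]
      valid = shuffled[n_train:n_train + n_valid]
      test = shuffled[n_train + n_valid:]  # remainder, roughly 10%
      return train, valid, test
  ```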

  We fine-tuned six representative models across the three tasks; a minimal fine-tuning sketch follows the result tables below.

  - Without Fine-Tuning
    |               | Stmt. Comp. | Stmt. Comp. | Next. Sugg. | Next. Sugg. | Code. Gen. | Code. Gen. |
    |-------------|:-----------------:|:-----------------:|:----------------:|:----------------:|:----------:|:----------:|
    |    **Model**    |         EM        |         ED        |        EM        |        ED        |    BLEU4   |     ED     |
    | CodeBert-c      |        0.00       |        0.97       |       0.00       |       1.31       |     0.00    |     0.44    |
    | GraphCodeBert-c |        0.00       |        0.35       |       0.00       |       0.54       |     0.00    |     2.41    |
    | UnixCoder-base-nine     |        0.07       |       27.56       |        15.93       |        29.11       |    0.00    |    31.81   |
    | CodeT5-base        |        0.65       |       21.45       |        7.23       |        23.50       |     0.00    |     13.57    |
    | NatGen        |        0.00       |       13.52       |       0.02       |       15.95      |     0.01    |     28.76    |
    | CodeT5+-220m       |        0.02       |        7.24       |       0.12       |       9.87       |    0.00    |    12.33   |

  - Fine-Tuned
    |               | Stmt. Comp. | Stmt. Comp. | Next. Sugg. | Next. Sugg. | Code. Gen. | Code. Gen. |
    |-------------|:-----------------:|:-----------------:|:----------------:|:----------------:|:----------:|:----------:|
    | **Model**         |         EM        |         ED        |        EM        |        ED        |    BLEU4   |     ED     |
    | CodeBert-c      |       53.84       |       77.44       |       52.67      |       70.82      |     xxx    |     xxx    |
    | GraphCodeBert-c |       43.00       |       71.89       |       47.10      |       61.31      |     xxx    |     xxx    |
    | UnixCoder-base-nine     |     **67.84**     |     **85.06**     |        58.51       |        75.31       |    56.24   |    73.45   |
    | CodeT5-base        |       66.38       |       84.34       |        58.52       |        76.03       |     70.87    |     80.45    |
    | NatGen        |       67.47       |       84.83       |     **60.30**    |     **76.84**    |     71.73    |     81.39    |
    | CodeT5+-220m       |       66.93       |       84.45       |       59.57      |       76.41      |  **75.28** |  **82.95** |
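
  The fine-tuning setup can be sketched as follows with Hugging Face `transformers`. Everything here is illustrative: the field names (`input`, `target`), hyperparameters, and the toy one-example dataset are assumptions, not the authors' training configuration.

  ```python
  # Hedged fine-tuning sketch; not the authors' training pipeline.
  from datasets import Dataset
  from transformers import (AutoTokenizer, DataCollatorForSeq2Seq,
                            Seq2SeqTrainer, Seq2SeqTrainingArguments,
                            T5ForConditionalGeneration)

  checkpoint = "Salesforce/codet5p-220m"
  tokenizer = AutoTokenizer.from_pretrained(checkpoint)
  model = T5ForConditionalGeneration.from_pretrained(checkpoint)

  # Toy stand-in for a ComBack split: (context, next statement) pairs.
  train = Dataset.from_dict({
      "input": ["maxCallFrameSize = (maxCallFrameSize + AlignMask) & ~AlignMask;"],
      "target": ["MFI -> setMaxCallFrameSize(maxCallFrameSize);"],
  })

  def preprocess(batch):
      enc = tokenizer(batch["input"], truncation=True, max_length=512)
      enc["labels"] = tokenizer(text_target=batch["target"],
                                truncation=True, max_length=256)["input_ids"]
      return enc

  train = train.map(preprocess, batched=True, remove_columns=["input", "target"])

  trainer = Seq2SeqTrainer(
      model=model,
      args=Seq2SeqTrainingArguments(output_dir="comback-ft",
                                    per_device_train_batch_size=1,
                                    num_train_epochs=1),
      train_dataset=train,
      data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
  )
  trainer.train()
  ```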


- `New_Targets/All_Types/*`: **data of RISC-V, ARC, and NVPTX (in both GCC and LLVM) as the test set; the remaining 171 (178 - 2*3 - 1) targets are split into train/valid sets in the ratio of 85%:15%, excluding RI5CY (RI5CY is customized based on RISC-V)**


  - Dataset Info


    | Task | Train | Valid | Test |
    | ---- | ---- | ---- | ---- |
    | Statement-Level Comp. | 114,016 (10.20M tokens) | 20,121 (1.81M tokens) | 6,645 (0.58M tokens) |
    | Next-Statement Sugg. | 152,114 (14.10M tokens) | 26,844 (2.49M tokens) | 9,313 (0.83M tokens) |
    | Code Generation | 30,633 (4.44M tokens) | 5,406 (0.79M tokens) | 2,819 (0.37M tokens) |

  
  We fine-tuned only CodeT5+ across the three tasks and compared it with ChatGPT-3.5-Turbo and Code-LLaMA-34B given similar inputs.

  - GCC

  |            | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Code. Gen. | Code. Gen. | Code. Gen. | Code. Gen. | Code. Gen. | Code. Gen. |
  |----------|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:----------:|:----------:|:----------:|:----------:|:----------:|:----------:|
  |            |    RISC-V   |    RISC-V   |     ARC     |     ARC     |    NVPTX    |    NVPTX    |    RISC-V   |    RISC-V   |     ARC     |     ARC     |    NVPTX    |    NVPTX    |   RISC-V   |   RISC-V   |     ARC    |     ARC    |    NVPTX   |    NVPTX   |
  | Model      |      EM     |      ED     |      EM     |      ED     |      EM     |      ED     |      EM     |      ED     |      EM     |      ED     |      EM     |      ED     |     BLEU4     |     ED     |     BLEU4     |     ED     |     BLEU4     |     ED     |
  | ChatGPT-3.5-Turbo    |    10.34    |    38.41    |    15.35    |    42.94    |    12.01    |    41.47    |     6.44    |     12.9    |     9.75    |    20.79    |     7.97    |    17.79    |    7.33    |    30.83   |    7.35    |    32.34   |    8.12    |    32.71   |
  | Code-LLaMA-34B |     0.41    |    19.07    |     0.85    |    16.77    |     0.56    |    18.22    |     1.58    |    13.54    |     2.66    |    17.95    |     2.47    |    16.59    |    9.38    |    35.53   |    11.06   |    37.15   |    8.24    |    33.00   |
  | CodeT5+-220m    |    **51.16**    |    **75.32**    |    **52.45**    |    **74.57**    |    **50.56**    |    **75.52**    |     **49.11**     |     **67.84**     |     **38.26**     |     **59.21**     |     **38.33**     |     **56.31**     |     **32.56**    |     **58.67**    |     **19.94**    |     **50.27**    |     **25.47**    |     **52.60**    |


  - LLVM

  |            | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Code. Gen. | Code. Gen. | Code. Gen. | Code. Gen. | Code. Gen. | Code. Gen. |
  |----------|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:----------:|:----------:|:----------:|:----------:|:----------:|:----------:|
  |            |    RISC-V   |    RISC-V   |     ARC     |     ARC     |    NVPTX    |    NVPTX    |    RISC-V   |    RISC-V   |     ARC     |     ARC     |    NVPTX    |    NVPTX    |   RISC-V   |   RISC-V   |     ARC    |     ARC    |    NVPTX   |    NVPTX   |
  | Model      |      EM     |      ED     |      EM     |      ED     |      EM     |      ED     |      EM     |      ED     |      EM     |      ED     |      EM     |      ED     |     BLEU4     |     ED     |     BLEU4     |     ED     |     BLEU4     |     ED     |
  | ChatGPT-3.5-Turbo    |     12.08     |     41.39     |     16.77     |     42.02     |     14.73     |     43.72     |     9.80     |    21.86    |    10.81    |    20.66    |    11.39    |    22.82    |    9.24    |    32.13   |    11.96   |    35.33   |    10.07   |    32.90    |
  | Code-LLaMA-34B |     0.45     |    17.61    |     0.61    |    17.21    |     0.99    |    17.23    |     1.75    |    15.04    |     0.42    |    11.27    |     2.42    |    16.25    |    6.92    |    32.54   |    8.95    |    38.22   |     8.20    |    34.16   |
  | CodeT5+-220m    |     **62.68**     |     **82.02**     |     **71.34**     |     **85.98**     |     **64.45**     |     **81.53**     |     **48.71**     |     **68.95**     |     **58.68**     |     **74.57**     |     **47.81**     |     **65.5**     |     **50.34**    |     **72.98**    |     **55.38**    |     **74.41**    |     **44.33**    |     **66.36**    |



- `New_Targets/CPU_Only/*`: **data of ARC and NVPTX (in both GCC and LLVM) as the test set; the CPU targets, excluding RISC-V and RI5CY, are split into train/valid sets in the ratio of 85%:15%**

  - Dataset Info


    | Task | Train | Valid | Test |
    | ---- | ---- | ---- | ---- |
    | Statement-Level Comp. | 87,018 (7.78M tokens) | 15,357 (1.37M tokens) | 2,764 (0.26M tokens) |
    | Next-Statement Sugg. | 113,684 (10.65M tokens) | 20,063 (1.87M tokens) | 4,029 (0.38M tokens) |
    | Code Generation | 21,184 (3.14M tokens) | 3,739 (0.55M tokens) | 1,372 (0.18M tokens) |

We fine-tuned only CodeT5+ across the three tasks and compared its accuracy on ARC (MPU) and NVPTX (GPU) against the corresponding results in `New_Targets/All_Types/*`; the **Decrease** row is the with-GPU/MPU score minus the without-GPU/MPU score.

  - GCC

    |  | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Code. Gen. | Code. Gen. | Code. Gen. | Code. Gen. |
    |:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:----------:  |:----------:  |:----------:  |:----------:  |
    |  | ARC(MPU) | ARC(MPU) | NVPTX(GPU) | NVPTX(GPU) | ARC(MPU) | ARC(MPU) | NVPTX(GPU) | NVPTX(GPU) | ARC(MPU) | ARC(MPU) | NVPTX(GPU) | NVPTX(GPU) |
    | Dataset | EM | ED | EM | ED | EM | ED | EM | ED | BLEU4 | ED | BLEU4 | ED |
    | -w GPU and MPU | 52.45 | 74.57 | 50.56 | 75.52 | 38.26 | 59.21 | 38.33 | 56.31 | 19.94 | 50.27 | 25.47 | 52.6 |
    | -w/o GPU and MPU | 50.53| 74.09 | 46.37 | 72.45 | 37.22 | 58.21 | 38.33 | 56.83 | 19.29 | 49.12 | 22.46 | 50.33 |
    |  **Decrease**  |  **1.92**  |  **0.48**  |  **4.19**  |  **3.07**  |  **1.04**  |  **1.00**  |  **0.00**  |  **-0.52**  |  **0.65**  |  **1.15**  |  **3.01**  |  **3.37**  |

  - LLVM
    |  | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Stmt. Comp. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Next. Sugg. | Code. Gen. | Code. Gen. | Code. Gen. | Code. Gen. |
    |------------------  |:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:----------:  |:----------:  |:----------:  |:----------:  |
    |  | ARC(MPU) | ARC(MPU) | NVPTX(GPU) | NVPTX(GPU) | ARC(MPU) | ARC(MPU) | NVPTX(GPU) | NVPTX(GPU) | ARC(MPU) | ARC(MPU) | NVPTX(GPU) | NVPTX(GPU) |
    | Dataset | EM | ED | EM | ED | EM | ED | EM | ED | BLEU4 | ED | BLEU4 | ED |
    | -w GPU and MPU | 71.34 | 85.98 | 64.45 | 81.53 | 58.68 | 74.57 | 47.81 | 65.50 | 55.38 | 74.41 | 44.33 | 66.36 |
    | -w/o GPU and MPU | 69.82 | 85.59 | 60.04 | 79.85 | 58.26 | 73.75 | 46.28 | 63.92 | 49.62 | 70.26 | 42.94 | 65.43 |
    |  **Decrease**  |  **1.52**  |  **0.39**  |  **4.41**  |  **1.68**  |  **0.42**  |  **0.82**  |  **1.53**  |  **1.58**  |  **5.76**  |  **4.15**  |  **1.39**  |  **0.93**  |

- `New_Targets/Itr_Expansion/*`: **data of RI5CY in LLVM as the test set; the CPU targets are split into train/valid sets in the ratio of 85%:15%, once excluding RISC-V and once including RISC-V**

  - Dataset Info
    - Excluding RISC-V

    | Task | Train | Valid | Test |
    | ---- | ---- | ---- | ---- |
    | Statement-Level Comp. | 87,018 (7.78M tokens) | 15,357 (1.37M tokens) | 721 (0.04M tokens) |
    | Next-Statement Sugg. | 113,684 (10.65M tokens) | 20,063 (1.87M tokens) | 1,035 (0.06M tokens) |
    | Code Generation | 21,184 (3.14M tokens) | 3,739 (0.55M tokens) | 219 (0.02M tokens) |

    - Including RISC-V
  
    | Task | Train | Valid | Test |
    | ---- | ---- | ---- | ---- |
    | Statement-Level Comp. | 90,316 (8.06M tokens) | 15,940 (1.42M tokens) | 721 (0.04M tokens) |
    | Next-Statement Sugg. | 118,175 (11.04M tokens) | 20,856 (1.94M tokens) | 1,035 (0.06M tokens) |
    | Code Generation | 22,413 (3.30M tokens) | 3,957 (0.58M tokens) | 219 (0.02M tokens) |

We fine-tuned only CodeT5+ across the three tasks and compared its accuracy on RI5CY with RISC-V excluded from versus included in the training data; the **Diff** row is the with-RISC-V score minus the without-RISC-V score.

  | | Stmt. Comp. | Stmt. Comp. | Next. Sugg. | Next. Sugg. | Code. Gen. | Code. Gen. |
  |:-----------:|:-----------:|:-----------:|:-----------:|:-----------:|:----------: |:----------: |
  | Dataset | EM | ED | EM | ED | BLEU4 | ED |
  | -w/o RISC-V | 66.16 | 83.79 | 57.29 | 74.73 | 54.41 | 75.41 |
  | -w RISC-V | 74.06 | 87.91 | 67.25 | 81.28 | 79.46 | 89.92 |
  | **Diff** | **7.90** | **4.12** | **9.96** | **6.55** | **25.05** | **14.51** |