rchan26 committed · Commit b4316fa · 1 Parent(s): 8570c4f

End of training

README.md ADDED
@@ -0,0 +1,654 @@
1
+ ---
2
+ tags:
3
+ - generated_from_trainer
4
+ model-index:
5
+ - name: english-char-roberta
6
+ results: []
7
+ ---
8
+
9
+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
10
+ should probably proofread and complete it, then remove this comment. -->
11
+
12
+ # english-char-roberta
13
+
14
+ This model is a fine-tuned version of an unspecified base model on an unspecified dataset.
15
+ It achieves the following results on the evaluation set:
16
+ - Loss: 0.9727
17
+
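For readers who prefer perplexity to raw loss, the reported evaluation loss can be converted with a single exponential. The sketch below assumes the loss is a mean cross-entropy in nats over masked positions, which is the usual convention for a `RobertaForMaskedLM` trained with the Hugging Face Trainer but is not stated in this card. `loss_to_perplexity` is a hypothetical helper name.

```python
import math


def loss_to_perplexity(loss: float) -> float:
    """Convert a mean cross-entropy loss (assumed to be in nats) to perplexity."""
    return math.exp(loss)


# Final evaluation loss reported in the model card.
final_eval_loss = 0.9727
perplexity = loss_to_perplexity(final_eval_loss)
print(f"perplexity ~ {perplexity:.3f}")
```

For scale, a uniform guess over the 57-entry vocabulary in `config.json` would give a perplexity of 57, so under the stated assumption the model is far better than chance at filling in a masked character.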
18
+ ## Model description
19
+
20
+ More information needed
21
+
22
+ ## Intended uses & limitations
23
+
24
+ More information needed
25
+
26
+ ## Training and evaluation data
27
+
28
+ More information needed
29
+
30
+ ## Training procedure
31
+
32
+ ### Training hyperparameters
33
+
34
+ The following hyperparameters were used during training:
35
+ - learning_rate: 5e-05
36
+ - train_batch_size: 128
37
+ - eval_batch_size: 8
38
+ - seed: 2023
39
+ - optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
40
+ - lr_scheduler_type: linear
41
+ - num_epochs: 600
42
+
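The `linear` scheduler entry above can be made concrete with a little arithmetic. The following is a minimal sketch of a linear decay from the listed learning rate to zero, assuming no warmup (the card lists none) and taking the total step count from the step column of the training results table in this card (296 steps per epoch). `linear_lr` is a hypothetical helper, not code from the training run.

```python
BASE_LR = 5e-05          # learning_rate from the card
STEPS_PER_EPOCH = 296    # step count at epoch 1.0 in the results table
NUM_EPOCHS = 600         # num_epochs from the card
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 177600, the last step in the table
WARMUP_STEPS = 0         # assumption: the card does not list any warmup


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step under linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        return BASE_LR * step / max(1, WARMUP_STEPS)
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


# Print the learning rate at a few points in training.
for epoch in (0, 150, 300, 450, 600):
    print(epoch, linear_lr(epoch * STEPS_PER_EPOCH))
```

Under these assumptions the learning rate is halved by epoch 300 and reaches zero at the final step, so late epochs make only very small updates.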
43
+ ### Training results
44
+
45
+ | Training Loss | Epoch | Step | Validation Loss |
46
+ |:-------------:|:-----:|:------:|:---------------:|
47
+ | No log | 1.0 | 296 | 2.1694 |
48
+ | 2.3335 | 2.0 | 592 | 1.9121 |
49
+ | 2.3335 | 3.0 | 888 | nan |
50
+ | 1.8727 | 4.0 | 1184 | 1.7151 |
51
+ | 1.8727 | 5.0 | 1480 | nan |
52
+ | 1.7283 | 6.0 | 1776 | nan |
53
+ | 1.6465 | 7.0 | 2072 | nan |
54
+ | 1.6465 | 8.0 | 2368 | 1.5415 |
55
+ | 1.5863 | 9.0 | 2664 | 1.5241 |
56
+ | 1.5863 | 10.0 | 2960 | 1.4887 |
57
+ | 1.533 | 11.0 | 3256 | nan |
58
+ | 1.489 | 12.0 | 3552 | 1.4435 |
59
+ | 1.489 | 13.0 | 3848 | nan |
60
+ | 1.4625 | 14.0 | 4144 | nan |
61
+ | 1.4625 | 15.0 | 4440 | 1.4028 |
62
+ | 1.4227 | 16.0 | 4736 | 1.4030 |
63
+ | 1.3996 | 17.0 | 5032 | 1.4001 |
64
+ | 1.3996 | 18.0 | 5328 | 1.3668 |
65
+ | 1.3827 | 19.0 | 5624 | 1.3777 |
66
+ | 1.3827 | 20.0 | 5920 | 1.3362 |
67
+ | 1.3463 | 21.0 | 6216 | nan |
68
+ | 1.3301 | 22.0 | 6512 | 1.3348 |
69
+ | 1.3301 | 23.0 | 6808 | 1.3038 |
70
+ | 1.3086 | 24.0 | 7104 | nan |
71
+ | 1.3086 | 25.0 | 7400 | 1.2902 |
72
+ | 1.2899 | 26.0 | 7696 | 1.2801 |
73
+ | 1.2899 | 27.0 | 7992 | 1.2718 |
74
+ | 1.2698 | 28.0 | 8288 | nan |
75
+ | 1.2506 | 29.0 | 8584 | nan |
76
+ | 1.2506 | 30.0 | 8880 | nan |
77
+ | 1.2308 | 31.0 | 9176 | 1.2399 |
78
+ | 1.2308 | 32.0 | 9472 | 1.2281 |
79
+ | 1.2167 | 33.0 | 9768 | nan |
80
+ | 1.1985 | 34.0 | 10064 | 1.2161 |
81
+ | 1.1985 | 35.0 | 10360 | 1.2127 |
82
+ | 1.1814 | 36.0 | 10656 | 1.2137 |
83
+ | 1.1814 | 37.0 | 10952 | 1.2192 |
84
+ | 1.1689 | 38.0 | 11248 | 1.1866 |
85
+ | 1.1529 | 39.0 | 11544 | 1.1991 |
86
+ | 1.1529 | 40.0 | 11840 | 1.1889 |
87
+ | 1.1375 | 41.0 | 12136 | 1.1721 |
88
+ | 1.1375 | 42.0 | 12432 | nan |
89
+ | 1.1234 | 43.0 | 12728 | 1.1761 |
90
+ | 1.1169 | 44.0 | 13024 | 1.1509 |
91
+ | 1.1169 | 45.0 | 13320 | 1.1552 |
92
+ | 1.0957 | 46.0 | 13616 | nan |
93
+ | 1.0957 | 47.0 | 13912 | 1.1407 |
94
+ | 1.0902 | 48.0 | 14208 | nan |
95
+ | 1.071 | 49.0 | 14504 | nan |
96
+ | 1.071 | 50.0 | 14800 | 1.1222 |
97
+ | 1.0539 | 51.0 | 15096 | 1.1288 |
98
+ | 1.0539 | 52.0 | 15392 | 1.1106 |
99
+ | 1.047 | 53.0 | 15688 | 1.0977 |
100
+ | 1.047 | 54.0 | 15984 | 1.1069 |
101
+ | 1.0385 | 55.0 | 16280 | 1.1068 |
102
+ | 1.0192 | 56.0 | 16576 | 1.1082 |
103
+ | 1.0192 | 57.0 | 16872 | 1.0953 |
104
+ | 1.0235 | 58.0 | 17168 | 1.0878 |
105
+ | 1.0235 | 59.0 | 17464 | nan |
106
+ | 1.0035 | 60.0 | 17760 | 1.1033 |
107
+ | 0.995 | 61.0 | 18056 | nan |
108
+ | 0.995 | 62.0 | 18352 | 1.0726 |
109
+ | 0.9842 | 63.0 | 18648 | 1.0697 |
110
+ | 0.9842 | 64.0 | 18944 | 1.0595 |
111
+ | 0.9815 | 65.0 | 19240 | nan |
112
+ | 0.9616 | 66.0 | 19536 | 1.0476 |
113
+ | 0.9616 | 67.0 | 19832 | nan |
114
+ | 0.9519 | 68.0 | 20128 | 1.0512 |
115
+ | 0.9519 | 69.0 | 20424 | 1.0420 |
116
+ | 0.9447 | 70.0 | 20720 | nan |
117
+ | 0.9317 | 71.0 | 21016 | nan |
118
+ | 0.9317 | 72.0 | 21312 | 1.0456 |
119
+ | 0.9197 | 73.0 | 21608 | 1.0689 |
120
+ | 0.9197 | 74.0 | 21904 | 1.0197 |
121
+ | 0.9208 | 75.0 | 22200 | 1.0294 |
122
+ | 0.9208 | 76.0 | 22496 | nan |
123
+ | 0.9094 | 77.0 | 22792 | 1.0272 |
124
+ | 0.9071 | 78.0 | 23088 | 1.0377 |
125
+ | 0.9071 | 79.0 | 23384 | nan |
126
+ | 0.8927 | 80.0 | 23680 | 1.0265 |
127
+ | 0.8927 | 81.0 | 23976 | 1.0245 |
128
+ | 0.8947 | 82.0 | 24272 | 1.0276 |
129
+ | 0.877 | 83.0 | 24568 | 1.0099 |
130
+ | 0.877 | 84.0 | 24864 | 1.0220 |
131
+ | 0.877 | 85.0 | 25160 | 1.0201 |
132
+ | 0.877 | 86.0 | 25456 | 1.0304 |
133
+ | 0.8678 | 87.0 | 25752 | 1.0238 |
134
+ | 0.861 | 88.0 | 26048 | 1.0129 |
135
+ | 0.861 | 89.0 | 26344 | 1.0134 |
136
+ | 0.8432 | 90.0 | 26640 | 0.9939 |
137
+ | 0.8432 | 91.0 | 26936 | 1.0188 |
138
+ | 0.8402 | 92.0 | 27232 | 1.0161 |
139
+ | 0.8297 | 93.0 | 27528 | 0.9779 |
140
+ | 0.8297 | 94.0 | 27824 | nan |
141
+ | 0.8267 | 95.0 | 28120 | 0.9953 |
142
+ | 0.8267 | 96.0 | 28416 | 1.0285 |
143
+ | 0.8189 | 97.0 | 28712 | 1.0018 |
144
+ | 0.811 | 98.0 | 29008 | 0.9986 |
145
+ | 0.811 | 99.0 | 29304 | nan |
146
+ | 0.8115 | 100.0 | 29600 | 0.9952 |
147
+ | 0.8115 | 101.0 | 29896 | 0.9946 |
148
+ | 0.7998 | 102.0 | 30192 | 0.9928 |
149
+ | 0.7998 | 103.0 | 30488 | nan |
150
+ | 0.7938 | 104.0 | 30784 | 0.9826 |
151
+ | 0.7791 | 105.0 | 31080 | nan |
152
+ | 0.7791 | 106.0 | 31376 | 0.9994 |
153
+ | 0.7842 | 107.0 | 31672 | 0.9917 |
154
+ | 0.7842 | 108.0 | 31968 | 0.9733 |
155
+ | 0.7739 | 109.0 | 32264 | 0.9635 |
156
+ | 0.7785 | 110.0 | 32560 | nan |
157
+ | 0.7785 | 111.0 | 32856 | nan |
158
+ | 0.763 | 112.0 | 33152 | 0.9751 |
159
+ | 0.763 | 113.0 | 33448 | 0.9755 |
160
+ | 0.76 | 114.0 | 33744 | 0.9811 |
161
+ | 0.7535 | 115.0 | 34040 | 0.9721 |
162
+ | 0.7535 | 116.0 | 34336 | nan |
163
+ | 0.7508 | 117.0 | 34632 | 0.9724 |
164
+ | 0.7508 | 118.0 | 34928 | nan |
165
+ | 0.7435 | 119.0 | 35224 | 0.9651 |
166
+ | 0.7406 | 120.0 | 35520 | nan |
167
+ | 0.7406 | 121.0 | 35816 | 0.9757 |
168
+ | 0.737 | 122.0 | 36112 | 0.9749 |
169
+ | 0.737 | 123.0 | 36408 | 0.9645 |
170
+ | 0.7289 | 124.0 | 36704 | 0.9699 |
171
+ | 0.7213 | 125.0 | 37000 | 0.9518 |
172
+ | 0.7213 | 126.0 | 37296 | 0.9509 |
173
+ | 0.7199 | 127.0 | 37592 | nan |
174
+ | 0.7199 | 128.0 | 37888 | 0.9614 |
175
+ | 0.7177 | 129.0 | 38184 | 0.9617 |
176
+ | 0.7177 | 130.0 | 38480 | 0.9372 |
177
+ | 0.706 | 131.0 | 38776 | 0.9399 |
178
+ | 0.7033 | 132.0 | 39072 | 0.9760 |
179
+ | 0.7033 | 133.0 | 39368 | 0.9560 |
180
+ | 0.7033 | 134.0 | 39664 | nan |
181
+ | 0.7033 | 135.0 | 39960 | nan |
182
+ | 0.7016 | 136.0 | 40256 | 0.9353 |
183
+ | 0.6901 | 137.0 | 40552 | 0.9352 |
184
+ | 0.6901 | 138.0 | 40848 | nan |
185
+ | 0.685 | 139.0 | 41144 | 0.9540 |
186
+ | 0.685 | 140.0 | 41440 | 0.9476 |
187
+ | 0.6881 | 141.0 | 41736 | nan |
188
+ | 0.6806 | 142.0 | 42032 | 0.9471 |
189
+ | 0.6806 | 143.0 | 42328 | 0.9518 |
190
+ | 0.6714 | 144.0 | 42624 | 0.9480 |
191
+ | 0.6714 | 145.0 | 42920 | nan |
192
+ | 0.6791 | 146.0 | 43216 | nan |
193
+ | 0.6664 | 147.0 | 43512 | 0.9610 |
194
+ | 0.6664 | 148.0 | 43808 | 0.9393 |
195
+ | 0.6668 | 149.0 | 44104 | nan |
196
+ | 0.6668 | 150.0 | 44400 | 0.9379 |
197
+ | 0.6516 | 151.0 | 44696 | 0.9522 |
198
+ | 0.6516 | 152.0 | 44992 | 0.9376 |
199
+ | 0.6587 | 153.0 | 45288 | 0.9561 |
200
+ | 0.6542 | 154.0 | 45584 | 0.9402 |
201
+ | 0.6542 | 155.0 | 45880 | 0.9407 |
202
+ | 0.6528 | 156.0 | 46176 | 0.9189 |
203
+ | 0.6528 | 157.0 | 46472 | 0.9412 |
204
+ | 0.6472 | 158.0 | 46768 | 0.9149 |
205
+ | 0.6466 | 159.0 | 47064 | 0.9380 |
206
+ | 0.6466 | 160.0 | 47360 | nan |
207
+ | 0.6377 | 161.0 | 47656 | nan |
208
+ | 0.6377 | 162.0 | 47952 | 0.9129 |
209
+ | 0.6279 | 163.0 | 48248 | nan |
210
+ | 0.6306 | 164.0 | 48544 | 0.9599 |
211
+ | 0.6306 | 165.0 | 48840 | nan |
212
+ | 0.6294 | 166.0 | 49136 | 0.9258 |
213
+ | 0.6294 | 167.0 | 49432 | 0.9223 |
214
+ | 0.6272 | 168.0 | 49728 | 0.9343 |
215
+ | 0.6217 | 169.0 | 50024 | nan |
216
+ | 0.6217 | 170.0 | 50320 | 0.9355 |
217
+ | 0.615 | 171.0 | 50616 | 0.9207 |
218
+ | 0.615 | 172.0 | 50912 | 0.9416 |
219
+ | 0.6105 | 173.0 | 51208 | 0.9379 |
220
+ | 0.6125 | 174.0 | 51504 | 0.9292 |
221
+ | 0.6125 | 175.0 | 51800 | nan |
222
+ | 0.6093 | 176.0 | 52096 | 0.9355 |
223
+ | 0.6093 | 177.0 | 52392 | 0.9148 |
224
+ | 0.6125 | 178.0 | 52688 | 0.9372 |
225
+ | 0.6125 | 179.0 | 52984 | 0.9452 |
226
+ | 0.6014 | 180.0 | 53280 | 0.9502 |
227
+ | 0.6019 | 181.0 | 53576 | 0.9439 |
228
+ | 0.6019 | 182.0 | 53872 | 0.9296 |
229
+ | 0.6019 | 183.0 | 54168 | 0.9511 |
230
+ | 0.6019 | 184.0 | 54464 | 0.9334 |
231
+ | 0.595 | 185.0 | 54760 | 0.9232 |
232
+ | 0.5889 | 186.0 | 55056 | 0.9308 |
233
+ | 0.5889 | 187.0 | 55352 | 0.9282 |
234
+ | 0.5911 | 188.0 | 55648 | 0.9435 |
235
+ | 0.5911 | 189.0 | 55944 | 0.9055 |
236
+ | 0.5906 | 190.0 | 56240 | 0.9593 |
237
+ | 0.5848 | 191.0 | 56536 | 0.9223 |
238
+ | 0.5848 | 192.0 | 56832 | 0.9102 |
239
+ | 0.5843 | 193.0 | 57128 | 0.9307 |
240
+ | 0.5843 | 194.0 | 57424 | 0.9027 |
241
+ | 0.581 | 195.0 | 57720 | nan |
242
+ | 0.5745 | 196.0 | 58016 | 0.9130 |
243
+ | 0.5745 | 197.0 | 58312 | 0.9183 |
244
+ | 0.582 | 198.0 | 58608 | 0.9191 |
245
+ | 0.582 | 199.0 | 58904 | 0.9407 |
246
+ | 0.5718 | 200.0 | 59200 | 0.9138 |
247
+ | 0.5718 | 201.0 | 59496 | 0.9133 |
248
+ | 0.5652 | 202.0 | 59792 | 0.9254 |
249
+ | 0.5711 | 203.0 | 60088 | 0.9537 |
250
+ | 0.5711 | 204.0 | 60384 | 0.9254 |
251
+ | 0.5699 | 205.0 | 60680 | 0.9386 |
252
+ | 0.5699 | 206.0 | 60976 | 0.9180 |
253
+ | 0.5546 | 207.0 | 61272 | nan |
254
+ | 0.5596 | 208.0 | 61568 | 0.9212 |
255
+ | 0.5596 | 209.0 | 61864 | 0.9264 |
256
+ | 0.5583 | 210.0 | 62160 | 0.9168 |
257
+ | 0.5583 | 211.0 | 62456 | 0.9155 |
258
+ | 0.5499 | 212.0 | 62752 | nan |
259
+ | 0.5601 | 213.0 | 63048 | 0.9182 |
260
+ | 0.5601 | 214.0 | 63344 | nan |
261
+ | 0.5528 | 215.0 | 63640 | 0.9178 |
262
+ | 0.5528 | 216.0 | 63936 | 0.9050 |
263
+ | 0.5484 | 217.0 | 64232 | nan |
264
+ | 0.5413 | 218.0 | 64528 | 0.9190 |
265
+ | 0.5413 | 219.0 | 64824 | 0.9201 |
266
+ | 0.546 | 220.0 | 65120 | 0.9400 |
267
+ | 0.546 | 221.0 | 65416 | 0.9068 |
268
+ | 0.5402 | 222.0 | 65712 | 0.9277 |
269
+ | 0.5407 | 223.0 | 66008 | nan |
270
+ | 0.5407 | 224.0 | 66304 | 0.9243 |
271
+ | 0.5341 | 225.0 | 66600 | 0.9355 |
272
+ | 0.5341 | 226.0 | 66896 | nan |
273
+ | 0.535 | 227.0 | 67192 | 0.9639 |
274
+ | 0.535 | 228.0 | 67488 | 0.9037 |
275
+ | 0.5364 | 229.0 | 67784 | 0.9277 |
276
+ | 0.5231 | 230.0 | 68080 | 0.9188 |
277
+ | 0.5231 | 231.0 | 68376 | 0.9140 |
278
+ | 0.523 | 232.0 | 68672 | 0.9234 |
279
+ | 0.523 | 233.0 | 68968 | 0.9507 |
280
+ | 0.5241 | 234.0 | 69264 | 0.9327 |
281
+ | 0.5258 | 235.0 | 69560 | nan |
282
+ | 0.5258 | 236.0 | 69856 | 0.9437 |
283
+ | 0.5224 | 237.0 | 70152 | 0.9478 |
284
+ | 0.5224 | 238.0 | 70448 | nan |
285
+ | 0.524 | 239.0 | 70744 | 0.9178 |
286
+ | 0.5233 | 240.0 | 71040 | nan |
287
+ | 0.5233 | 241.0 | 71336 | 0.8992 |
288
+ | 0.5166 | 242.0 | 71632 | 0.9349 |
289
+ | 0.5166 | 243.0 | 71928 | nan |
290
+ | 0.5228 | 244.0 | 72224 | 0.9484 |
291
+ | 0.5183 | 245.0 | 72520 | nan |
292
+ | 0.5183 | 246.0 | 72816 | nan |
293
+ | 0.5119 | 247.0 | 73112 | 0.9126 |
294
+ | 0.5119 | 248.0 | 73408 | 0.9244 |
295
+ | 0.5084 | 249.0 | 73704 | 0.9210 |
296
+ | 0.5107 | 250.0 | 74000 | 0.9404 |
297
+ | 0.5107 | 251.0 | 74296 | 0.9317 |
298
+ | 0.5072 | 252.0 | 74592 | 0.9537 |
299
+ | 0.5072 | 253.0 | 74888 | 0.9472 |
300
+ | 0.4958 | 254.0 | 75184 | 0.9127 |
301
+ | 0.4958 | 255.0 | 75480 | 0.9189 |
302
+ | 0.4982 | 256.0 | 75776 | 0.9286 |
303
+ | 0.4997 | 257.0 | 76072 | 0.9322 |
304
+ | 0.4997 | 258.0 | 76368 | 0.9080 |
305
+ | 0.5144 | 259.0 | 76664 | 0.9104 |
306
+ | 0.5144 | 260.0 | 76960 | 0.9413 |
307
+ | 0.4942 | 261.0 | 77256 | 0.9330 |
308
+ | 0.496 | 262.0 | 77552 | 0.9327 |
309
+ | 0.496 | 263.0 | 77848 | 0.9387 |
310
+ | 0.4903 | 264.0 | 78144 | 0.9330 |
311
+ | 0.4903 | 265.0 | 78440 | 0.9496 |
312
+ | 0.4991 | 266.0 | 78736 | 0.9340 |
313
+ | 0.4809 | 267.0 | 79032 | nan |
314
+ | 0.4809 | 268.0 | 79328 | 0.9367 |
315
+ | 0.4929 | 269.0 | 79624 | 0.9388 |
316
+ | 0.4929 | 270.0 | 79920 | 0.9259 |
317
+ | 0.4856 | 271.0 | 80216 | 0.9005 |
318
+ | 0.4865 | 272.0 | 80512 | 0.9265 |
319
+ | 0.4865 | 273.0 | 80808 | 0.9347 |
320
+ | 0.4889 | 274.0 | 81104 | 0.9299 |
321
+ | 0.4889 | 275.0 | 81400 | nan |
322
+ | 0.4841 | 276.0 | 81696 | 0.9292 |
323
+ | 0.4841 | 277.0 | 81992 | nan |
324
+ | 0.4748 | 278.0 | 82288 | 0.9155 |
325
+ | 0.4821 | 279.0 | 82584 | 0.9139 |
326
+ | 0.4821 | 280.0 | 82880 | 0.9324 |
327
+ | 0.4888 | 281.0 | 83176 | 0.9136 |
328
+ | 0.4888 | 282.0 | 83472 | nan |
329
+ | 0.4839 | 283.0 | 83768 | 0.9363 |
330
+ | 0.477 | 284.0 | 84064 | nan |
331
+ | 0.477 | 285.0 | 84360 | nan |
332
+ | 0.4784 | 286.0 | 84656 | 0.9179 |
333
+ | 0.4784 | 287.0 | 84952 | 0.9231 |
334
+ | 0.4709 | 288.0 | 85248 | nan |
335
+ | 0.4745 | 289.0 | 85544 | 0.9371 |
336
+ | 0.4745 | 290.0 | 85840 | 0.9337 |
337
+ | 0.4758 | 291.0 | 86136 | 0.9363 |
338
+ | 0.4758 | 292.0 | 86432 | 0.9385 |
339
+ | 0.4644 | 293.0 | 86728 | 0.9495 |
340
+ | 0.4664 | 294.0 | 87024 | 0.9344 |
341
+ | 0.4664 | 295.0 | 87320 | 0.9290 |
342
+ | 0.4618 | 296.0 | 87616 | nan |
343
+ | 0.4618 | 297.0 | 87912 | 0.9443 |
344
+ | 0.4636 | 298.0 | 88208 | nan |
345
+ | 0.4654 | 299.0 | 88504 | nan |
346
+ | 0.4654 | 300.0 | 88800 | 0.9582 |
347
+ | 0.4555 | 301.0 | 89096 | 0.9339 |
348
+ | 0.4555 | 302.0 | 89392 | 0.9333 |
349
+ | 0.4546 | 303.0 | 89688 | nan |
350
+ | 0.4546 | 304.0 | 89984 | nan |
351
+ | 0.4657 | 305.0 | 90280 | nan |
352
+ | 0.462 | 306.0 | 90576 | 0.9356 |
353
+ | 0.462 | 307.0 | 90872 | 0.9635 |
354
+ | 0.4549 | 308.0 | 91168 | 0.9442 |
355
+ | 0.4549 | 309.0 | 91464 | 0.9381 |
356
+ | 0.4514 | 310.0 | 91760 | nan |
357
+ | 0.4551 | 311.0 | 92056 | 0.9287 |
358
+ | 0.4551 | 312.0 | 92352 | nan |
359
+ | 0.4508 | 313.0 | 92648 | 0.9453 |
360
+ | 0.4508 | 314.0 | 92944 | 0.9312 |
361
+ | 0.4487 | 315.0 | 93240 | 0.9470 |
362
+ | 0.4484 | 316.0 | 93536 | 0.9548 |
363
+ | 0.4484 | 317.0 | 93832 | 0.9437 |
364
+ | 0.4445 | 318.0 | 94128 | 0.9265 |
365
+ | 0.4445 | 319.0 | 94424 | 0.9231 |
366
+ | 0.4486 | 320.0 | 94720 | 0.9492 |
367
+ | 0.4446 | 321.0 | 95016 | 0.9241 |
368
+ | 0.4446 | 322.0 | 95312 | 0.9256 |
369
+ | 0.4429 | 323.0 | 95608 | nan |
370
+ | 0.4429 | 324.0 | 95904 | 0.9394 |
371
+ | 0.4437 | 325.0 | 96200 | 0.9545 |
372
+ | 0.4437 | 326.0 | 96496 | 0.9416 |
373
+ | 0.4363 | 327.0 | 96792 | 0.9363 |
374
+ | 0.4393 | 328.0 | 97088 | 0.9140 |
375
+ | 0.4393 | 329.0 | 97384 | 0.9428 |
376
+ | 0.4409 | 330.0 | 97680 | 0.9394 |
377
+ | 0.4409 | 331.0 | 97976 | 0.9163 |
378
+ | 0.4407 | 332.0 | 98272 | 0.9323 |
379
+ | 0.4355 | 333.0 | 98568 | nan |
380
+ | 0.4355 | 334.0 | 98864 | 0.9358 |
381
+ | 0.4304 | 335.0 | 99160 | 0.9500 |
382
+ | 0.4304 | 336.0 | 99456 | 0.9299 |
383
+ | 0.4429 | 337.0 | 99752 | 0.9484 |
384
+ | 0.4357 | 338.0 | 100048 | 0.9476 |
385
+ | 0.4357 | 339.0 | 100344 | 0.9518 |
386
+ | 0.4375 | 340.0 | 100640 | 0.9357 |
387
+ | 0.4375 | 341.0 | 100936 | 0.9234 |
388
+ | 0.4401 | 342.0 | 101232 | nan |
389
+ | 0.4306 | 343.0 | 101528 | 0.9506 |
390
+ | 0.4306 | 344.0 | 101824 | 0.9309 |
391
+ | 0.4303 | 345.0 | 102120 | 0.9503 |
392
+ | 0.4303 | 346.0 | 102416 | 0.9406 |
393
+ | 0.4264 | 347.0 | 102712 | 0.9542 |
394
+ | 0.426 | 348.0 | 103008 | 0.9504 |
395
+ | 0.426 | 349.0 | 103304 | 0.9367 |
396
+ | 0.4212 | 350.0 | 103600 | nan |
397
+ | 0.4212 | 351.0 | 103896 | 0.9271 |
398
+ | 0.4252 | 352.0 | 104192 | 0.9554 |
399
+ | 0.4252 | 353.0 | 104488 | nan |
400
+ | 0.4196 | 354.0 | 104784 | 0.9406 |
401
+ | 0.4188 | 355.0 | 105080 | nan |
402
+ | 0.4188 | 356.0 | 105376 | nan |
403
+ | 0.4194 | 357.0 | 105672 | 0.9647 |
404
+ | 0.4194 | 358.0 | 105968 | 0.9296 |
405
+ | 0.4181 | 359.0 | 106264 | 0.9649 |
406
+ | 0.4192 | 360.0 | 106560 | 0.9635 |
407
+ | 0.4192 | 361.0 | 106856 | 0.9381 |
408
+ | 0.4242 | 362.0 | 107152 | 0.9330 |
409
+ | 0.4242 | 363.0 | 107448 | nan |
410
+ | 0.4192 | 364.0 | 107744 | 0.9286 |
411
+ | 0.42 | 365.0 | 108040 | 0.9379 |
412
+ | 0.42 | 366.0 | 108336 | nan |
413
+ | 0.4239 | 367.0 | 108632 | 0.9345 |
414
+ | 0.4239 | 368.0 | 108928 | 0.9639 |
415
+ | 0.4116 | 369.0 | 109224 | 0.9473 |
416
+ | 0.4172 | 370.0 | 109520 | nan |
417
+ | 0.4172 | 371.0 | 109816 | 0.9393 |
418
+ | 0.4114 | 372.0 | 110112 | nan |
419
+ | 0.4114 | 373.0 | 110408 | 0.9562 |
420
+ | 0.4166 | 374.0 | 110704 | 0.9416 |
421
+ | 0.4086 | 375.0 | 111000 | 0.9480 |
422
+ | 0.4086 | 376.0 | 111296 | 0.9436 |
423
+ | 0.4127 | 377.0 | 111592 | 0.9559 |
424
+ | 0.4127 | 378.0 | 111888 | 0.9381 |
425
+ | 0.4115 | 379.0 | 112184 | 0.9248 |
426
+ | 0.4115 | 380.0 | 112480 | 0.9257 |
427
+ | 0.4116 | 381.0 | 112776 | 0.9517 |
428
+ | 0.4023 | 382.0 | 113072 | nan |
429
+ | 0.4023 | 383.0 | 113368 | 0.9580 |
430
+ | 0.4032 | 384.0 | 113664 | 0.9285 |
431
+ | 0.4032 | 385.0 | 113960 | 0.9529 |
432
+ | 0.4052 | 386.0 | 114256 | 0.9461 |
433
+ | 0.403 | 387.0 | 114552 | 0.9712 |
434
+ | 0.403 | 388.0 | 114848 | 0.9551 |
435
+ | 0.4066 | 389.0 | 115144 | 0.9576 |
436
+ | 0.4066 | 390.0 | 115440 | nan |
437
+ | 0.3977 | 391.0 | 115736 | 0.9420 |
438
+ | 0.3994 | 392.0 | 116032 | nan |
439
+ | 0.3994 | 393.0 | 116328 | 0.9847 |
440
+ | 0.4017 | 394.0 | 116624 | 0.9627 |
441
+ | 0.4017 | 395.0 | 116920 | 0.9713 |
442
+ | 0.4045 | 396.0 | 117216 | 0.9635 |
443
+ | 0.3956 | 397.0 | 117512 | 0.9617 |
444
+ | 0.3956 | 398.0 | 117808 | 0.9500 |
445
+ | 0.3981 | 399.0 | 118104 | 0.9638 |
446
+ | 0.3981 | 400.0 | 118400 | 0.9536 |
447
+ | 0.3977 | 401.0 | 118696 | 0.9402 |
448
+ | 0.3977 | 402.0 | 118992 | nan |
449
+ | 0.3973 | 403.0 | 119288 | 0.9578 |
450
+ | 0.3926 | 404.0 | 119584 | 0.9201 |
451
+ | 0.3926 | 405.0 | 119880 | 0.9664 |
452
+ | 0.3918 | 406.0 | 120176 | 0.9447 |
453
+ | 0.3918 | 407.0 | 120472 | 0.9577 |
454
+ | 0.3955 | 408.0 | 120768 | 0.9549 |
455
+ | 0.3975 | 409.0 | 121064 | 0.9208 |
456
+ | 0.3975 | 410.0 | 121360 | nan |
457
+ | 0.3966 | 411.0 | 121656 | nan |
458
+ | 0.3966 | 412.0 | 121952 | 0.9895 |
459
+ | 0.3994 | 413.0 | 122248 | 0.9452 |
460
+ | 0.3885 | 414.0 | 122544 | nan |
461
+ | 0.3885 | 415.0 | 122840 | 0.9658 |
462
+ | 0.3964 | 416.0 | 123136 | nan |
463
+ | 0.3964 | 417.0 | 123432 | 0.9499 |
464
+ | 0.3906 | 418.0 | 123728 | 0.9538 |
465
+ | 0.3899 | 419.0 | 124024 | 0.9833 |
466
+ | 0.3899 | 420.0 | 124320 | 0.9493 |
467
+ | 0.3824 | 421.0 | 124616 | 0.9437 |
468
+ | 0.3824 | 422.0 | 124912 | 0.9457 |
469
+ | 0.3872 | 423.0 | 125208 | 0.9732 |
470
+ | 0.3855 | 424.0 | 125504 | 0.9371 |
471
+ | 0.3855 | 425.0 | 125800 | 0.9541 |
472
+ | 0.3857 | 426.0 | 126096 | 0.9619 |
473
+ | 0.3857 | 427.0 | 126392 | 0.9530 |
474
+ | 0.3822 | 428.0 | 126688 | 0.9561 |
475
+ | 0.3822 | 429.0 | 126984 | 0.9774 |
476
+ | 0.3849 | 430.0 | 127280 | 0.9647 |
477
+ | 0.383 | 431.0 | 127576 | 0.9550 |
478
+ | 0.383 | 432.0 | 127872 | 0.9736 |
479
+ | 0.3816 | 433.0 | 128168 | 0.9726 |
480
+ | 0.3816 | 434.0 | 128464 | nan |
481
+ | 0.3804 | 435.0 | 128760 | 0.9570 |
482
+ | 0.3823 | 436.0 | 129056 | 0.9570 |
483
+ | 0.3823 | 437.0 | 129352 | 0.9750 |
484
+ | 0.3846 | 438.0 | 129648 | 0.9612 |
485
+ | 0.3846 | 439.0 | 129944 | 0.9831 |
486
+ | 0.3789 | 440.0 | 130240 | 0.9610 |
487
+ | 0.38 | 441.0 | 130536 | 0.9536 |
488
+ | 0.38 | 442.0 | 130832 | 0.9641 |
489
+ | 0.3812 | 443.0 | 131128 | 0.9603 |
490
+ | 0.3812 | 444.0 | 131424 | 0.9615 |
491
+ | 0.37 | 445.0 | 131720 | 0.9609 |
492
+ | 0.3845 | 446.0 | 132016 | 0.9553 |
493
+ | 0.3845 | 447.0 | 132312 | 0.9761 |
494
+ | 0.3755 | 448.0 | 132608 | 0.9866 |
495
+ | 0.3755 | 449.0 | 132904 | nan |
496
+ | 0.3669 | 450.0 | 133200 | 0.9658 |
497
+ | 0.3669 | 451.0 | 133496 | 0.9700 |
498
+ | 0.3767 | 452.0 | 133792 | 0.9571 |
499
+ | 0.3709 | 453.0 | 134088 | nan |
500
+ | 0.3709 | 454.0 | 134384 | nan |
501
+ | 0.3767 | 455.0 | 134680 | 0.9465 |
502
+ | 0.3767 | 456.0 | 134976 | nan |
503
+ | 0.3733 | 457.0 | 135272 | 0.9578 |
504
+ | 0.3685 | 458.0 | 135568 | 0.9557 |
505
+ | 0.3685 | 459.0 | 135864 | nan |
506
+ | 0.3712 | 460.0 | 136160 | 0.9502 |
507
+ | 0.3712 | 461.0 | 136456 | 0.9458 |
508
+ | 0.3762 | 462.0 | 136752 | nan |
509
+ | 0.3645 | 463.0 | 137048 | 0.9776 |
510
+ | 0.3645 | 464.0 | 137344 | 0.9759 |
511
+ | 0.3675 | 465.0 | 137640 | nan |
512
+ | 0.3675 | 466.0 | 137936 | nan |
513
+ | 0.3686 | 467.0 | 138232 | nan |
514
+ | 0.3708 | 468.0 | 138528 | nan |
515
+ | 0.3708 | 469.0 | 138824 | 0.9732 |
516
+ | 0.3659 | 470.0 | 139120 | 0.9649 |
517
+ | 0.3659 | 471.0 | 139416 | nan |
518
+ | 0.3714 | 472.0 | 139712 | 0.9643 |
519
+ | 0.3652 | 473.0 | 140008 | 0.9713 |
520
+ | 0.3652 | 474.0 | 140304 | 0.9585 |
521
+ | 0.3632 | 475.0 | 140600 | 0.9675 |
522
+ | 0.3632 | 476.0 | 140896 | 0.9897 |
523
+ | 0.3622 | 477.0 | 141192 | nan |
524
+ | 0.3622 | 478.0 | 141488 | 0.9668 |
525
+ | 0.3517 | 479.0 | 141784 | nan |
526
+ | 0.3674 | 480.0 | 142080 | 0.9741 |
527
+ | 0.3674 | 481.0 | 142376 | 0.9635 |
528
+ | 0.3595 | 482.0 | 142672 | nan |
529
+ | 0.3595 | 483.0 | 142968 | nan |
530
+ | 0.3568 | 484.0 | 143264 | nan |
531
+ | 0.359 | 485.0 | 143560 | 0.9639 |
532
+ | 0.359 | 486.0 | 143856 | 0.9764 |
533
+ | 0.3613 | 487.0 | 144152 | nan |
534
+ | 0.3613 | 488.0 | 144448 | nan |
535
+ | 0.3569 | 489.0 | 144744 | 0.9616 |
536
+ | 0.3556 | 490.0 | 145040 | 0.9700 |
537
+ | 0.3556 | 491.0 | 145336 | nan |
538
+ | 0.3541 | 492.0 | 145632 | 0.9683 |
539
+ | 0.3541 | 493.0 | 145928 | 0.9779 |
540
+ | 0.3628 | 494.0 | 146224 | nan |
541
+ | 0.3535 | 495.0 | 146520 | 0.9489 |
542
+ | 0.3535 | 496.0 | 146816 | nan |
543
+ | 0.3539 | 497.0 | 147112 | 0.9848 |
544
+ | 0.3539 | 498.0 | 147408 | 0.9663 |
545
+ | 0.3601 | 499.0 | 147704 | 0.9565 |
546
+ | 0.3528 | 500.0 | 148000 | nan |
547
+ | 0.3528 | 501.0 | 148296 | nan |
548
+ | 0.3518 | 502.0 | 148592 | 0.9768 |
549
+ | 0.3518 | 503.0 | 148888 | nan |
550
+ | 0.3563 | 504.0 | 149184 | 0.9934 |
551
+ | 0.3563 | 505.0 | 149480 | 1.0003 |
552
+ | 0.3471 | 506.0 | 149776 | 0.9797 |
553
+ | 0.3544 | 507.0 | 150072 | nan |
554
+ | 0.3544 | 508.0 | 150368 | 0.9805 |
555
+ | 0.3588 | 509.0 | 150664 | nan |
556
+ | 0.3588 | 510.0 | 150960 | 0.9637 |
557
+ | 0.359 | 511.0 | 151256 | 0.9724 |
558
+ | 0.3497 | 512.0 | 151552 | 0.9661 |
559
+ | 0.3497 | 513.0 | 151848 | 0.9555 |
560
+ | 0.3535 | 514.0 | 152144 | 0.9618 |
561
+ | 0.3535 | 515.0 | 152440 | nan |
562
+ | 0.3406 | 516.0 | 152736 | 0.9747 |
563
+ | 0.3486 | 517.0 | 153032 | nan |
564
+ | 0.3486 | 518.0 | 153328 | nan |
565
+ | 0.3497 | 519.0 | 153624 | 0.9851 |
566
+ | 0.3497 | 520.0 | 153920 | 0.9804 |
567
+ | 0.3511 | 521.0 | 154216 | 0.9827 |
568
+ | 0.3492 | 522.0 | 154512 | 0.9741 |
569
+ | 0.3492 | 523.0 | 154808 | nan |
570
+ | 0.3482 | 524.0 | 155104 | 0.9678 |
571
+ | 0.3482 | 525.0 | 155400 | 0.9912 |
572
+ | 0.3412 | 526.0 | 155696 | 0.9408 |
573
+ | 0.3412 | 527.0 | 155992 | 0.9887 |
574
+ | 0.3414 | 528.0 | 156288 | 0.9669 |
575
+ | 0.347 | 529.0 | 156584 | 0.9694 |
576
+ | 0.347 | 530.0 | 156880 | 0.9731 |
577
+ | 0.3463 | 531.0 | 157176 | 0.9726 |
578
+ | 0.3463 | 532.0 | 157472 | 0.9880 |
579
+ | 0.3403 | 533.0 | 157768 | 0.9766 |
580
+ | 0.3481 | 534.0 | 158064 | 0.9835 |
581
+ | 0.3481 | 535.0 | 158360 | nan |
582
+ | 0.3448 | 536.0 | 158656 | nan |
583
+ | 0.3448 | 537.0 | 158952 | 0.9853 |
584
+ | 0.3484 | 538.0 | 159248 | 0.9906 |
585
+ | 0.344 | 539.0 | 159544 | 0.9687 |
586
+ | 0.344 | 540.0 | 159840 | 0.9806 |
587
+ | 0.3431 | 541.0 | 160136 | 0.9662 |
588
+ | 0.3431 | 542.0 | 160432 | 0.9771 |
589
+ | 0.3428 | 543.0 | 160728 | 0.9585 |
590
+ | 0.3457 | 544.0 | 161024 | nan |
591
+ | 0.3457 | 545.0 | 161320 | nan |
592
+ | 0.337 | 546.0 | 161616 | 0.9601 |
593
+ | 0.337 | 547.0 | 161912 | 0.9868 |
594
+ | 0.3402 | 548.0 | 162208 | nan |
595
+ | 0.3373 | 549.0 | 162504 | 0.9707 |
596
+ | 0.3373 | 550.0 | 162800 | 0.9949 |
597
+ | 0.3436 | 551.0 | 163096 | 0.9685 |
598
+ | 0.3436 | 552.0 | 163392 | 0.9804 |
599
+ | 0.3394 | 553.0 | 163688 | 0.9959 |
600
+ | 0.3394 | 554.0 | 163984 | 0.9838 |
601
+ | 0.3358 | 555.0 | 164280 | 0.9677 |
602
+ | 0.3412 | 556.0 | 164576 | nan |
603
+ | 0.3412 | 557.0 | 164872 | 0.9512 |
604
+ | 0.3426 | 558.0 | 165168 | 0.9678 |
605
+ | 0.3426 | 559.0 | 165464 | nan |
606
+ | 0.3339 | 560.0 | 165760 | nan |
607
+ | 0.3395 | 561.0 | 166056 | 0.9697 |
608
+ | 0.3395 | 562.0 | 166352 | nan |
609
+ | 0.3344 | 563.0 | 166648 | 0.9724 |
610
+ | 0.3344 | 564.0 | 166944 | nan |
611
+ | 0.34 | 565.0 | 167240 | 0.9977 |
612
+ | 0.3324 | 566.0 | 167536 | 0.9808 |
613
+ | 0.3324 | 567.0 | 167832 | 1.0174 |
614
+ | 0.3319 | 568.0 | 168128 | nan |
615
+ | 0.3319 | 569.0 | 168424 | 0.9758 |
616
+ | 0.3238 | 570.0 | 168720 | 0.9683 |
617
+ | 0.3291 | 571.0 | 169016 | nan |
618
+ | 0.3291 | 572.0 | 169312 | 0.9676 |
619
+ | 0.3319 | 573.0 | 169608 | 0.9721 |
620
+ | 0.3319 | 574.0 | 169904 | 0.9797 |
621
+ | 0.3349 | 575.0 | 170200 | 0.9471 |
622
+ | 0.3349 | 576.0 | 170496 | 0.9794 |
623
+ | 0.3344 | 577.0 | 170792 | 0.9909 |
624
+ | 0.331 | 578.0 | 171088 | 0.9861 |
625
+ | 0.331 | 579.0 | 171384 | nan |
626
+ | 0.3291 | 580.0 | 171680 | 0.9814 |
627
+ | 0.3291 | 581.0 | 171976 | 0.9804 |
628
+ | 0.3264 | 582.0 | 172272 | 0.9743 |
629
+ | 0.3251 | 583.0 | 172568 | 0.9636 |
630
+ | 0.3251 | 584.0 | 172864 | nan |
631
+ | 0.33 | 585.0 | 173160 | 0.9621 |
632
+ | 0.33 | 586.0 | 173456 | nan |
633
+ | 0.3304 | 587.0 | 173752 | 0.9753 |
634
+ | 0.3315 | 588.0 | 174048 | 0.9798 |
635
+ | 0.3315 | 589.0 | 174344 | nan |
636
+ | 0.3246 | 590.0 | 174640 | 0.9816 |
637
+ | 0.3246 | 591.0 | 174936 | 0.9924 |
638
+ | 0.3366 | 592.0 | 175232 | 0.9920 |
639
+ | 0.3302 | 593.0 | 175528 | 0.9693 |
640
+ | 0.3302 | 594.0 | 175824 | 0.9792 |
641
+ | 0.3328 | 595.0 | 176120 | nan |
642
+ | 0.3328 | 596.0 | 176416 | 0.9770 |
643
+ | 0.3338 | 597.0 | 176712 | 0.9668 |
644
+ | 0.3293 | 598.0 | 177008 | 0.9965 |
645
+ | 0.3293 | 599.0 | 177304 | 0.9825 |
646
+ | 0.3275 | 600.0 | 177600 | 0.9727 |
647
+
648
+
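Many rows in the table above report `nan` for validation loss, so any comparison across epochs has to skip them. The sketch below parses markdown rows in this table's layout, drops the `nan` entries, and picks the lowest remaining validation loss. It runs on a three-row excerpt copied from the table (epochs 240 to 242), not on the full table, and `parse_rows` is a hypothetical helper.

```python
import math

# A short excerpt of rows copied from the table above (epochs 240-242).
EXCERPT = """\
| 0.5233        | 240.0 | 71040  | nan             |
| 0.5233        | 241.0 | 71336  | 0.8992          |
| 0.5166        | 242.0 | 71632  | 0.9349          |
"""


def parse_rows(markdown: str):
    """Yield (train_loss, epoch, step, val_loss) tuples from markdown table rows."""
    for line in markdown.splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        train, epoch, step, val = cells
        try:
            epoch_f, step_i = float(epoch), int(step)
        except ValueError:
            continue  # header or separator row
        # "No log" appears before the first logging step; treat it as missing.
        train_f = float("nan") if train == "No log" else float(train)
        yield train_f, epoch_f, step_i, float(val)  # float("nan") parses to NaN


rows = list(parse_rows(EXCERPT))
valid = [r for r in rows if not math.isnan(r[3])]
best = min(valid, key=lambda r: r[3])
print(f"{len(rows) - len(valid)} nan row(s) skipped; "
      f"best epoch in excerpt: {best[1]} (val loss {best[3]})")
```

Read alongside the table, the training loss keeps falling through epoch 600 (to roughly 0.33) while the validation loss appears to level off near 0.9 and finishes at 0.9727, which may point to overfitting in the later epochs. The card does not explain the `nan` entries.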
649
+ ### Framework versions
650
+
651
+ - Transformers 4.34.0
652
+ - PyTorch 2.1.0
653
+ - Datasets 2.14.5
654
+ - Tokenizers 0.14.1
added_tokens.json ADDED
@@ -0,0 +1,7 @@
1
+ {
2
+ "</s>": 1,
3
+ "<mask>": 4,
4
+ "<pad>": 3,
5
+ "<s>": 0,
6
+ "<unk>": 2
7
+ }
config.json ADDED
@@ -0,0 +1,27 @@
1
+ {
2
+ "architectures": [
3
+ "RobertaForMaskedLM"
4
+ ],
5
+ "attention_probs_dropout_prob": 0.1,
6
+ "bos_token_id": 0,
7
+ "classifier_dropout": null,
8
+ "eos_token_id": 2,
9
+ "hidden_act": "gelu",
10
+ "hidden_dropout_prob": 0.1,
11
+ "hidden_size": 768,
12
+ "initializer_range": 0.02,
13
+ "intermediate_size": 3072,
14
+ "layer_norm_eps": 1e-12,
15
+ "max_length": 50,
16
+ "max_position_embeddings": 52,
17
+ "model_type": "roberta",
18
+ "num_attention_heads": 12,
19
+ "num_hidden_layers": 6,
20
+ "pad_token_id": 1,
21
+ "position_embedding_type": "absolute",
22
+ "torch_dtype": "float32",
23
+ "transformers_version": "4.34.0",
24
+ "type_vocab_size": 1,
25
+ "use_cache": true,
26
+ "vocab_size": 57
27
+ }
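The special-token IDs in this `config.json` can be cross-checked against the IDs the tokenizer files in the same commit assign to the corresponding strings. The sketch below hard-codes the values shown in this commit and reports any disagreement. `find_mismatches` is a hypothetical helper, and the mapping from config keys to token strings follows `special_tokens_map.json`.

```python
# Values copied from config.json in this commit.
config_ids = {"bos_token_id": 0, "eos_token_id": 2, "pad_token_id": 1}

# Special-token entries copied from added_tokens.json / vocab.json in this commit.
tokenizer_ids = {"<s>": 0, "</s>": 1, "<unk>": 2, "<pad>": 3, "<mask>": 4}

# Token strings from special_tokens_map.json in this commit.
special_tokens = {"bos_token_id": "<s>", "eos_token_id": "</s>", "pad_token_id": "<pad>"}


def find_mismatches(config, tokenizer, names):
    """Return {config_key: (config_id, tokenizer_id)} wherever the two disagree."""
    mismatches = {}
    for key, token in names.items():
        if config[key] != tokenizer[token]:
            mismatches[key] = (config[key], tokenizer[token])
    return mismatches


print(find_mismatches(config_ids, tokenizer_ids, special_tokens))
```

On the values shown here, the check flags `eos_token_id` (2 in the config, 1 in the tokenizer) and `pad_token_id` (1 in the config, 3 in the tokenizer). Whether this matters depends on how the model is used. For example, Hugging Face's RoBERTa implementation derives its position-embedding offset from `pad_token_id`, so it is worth verifying. This is a possible issue to check, not a confirmed bug.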
merges.txt ADDED
@@ -0,0 +1 @@
1
+ #version: 0.2
pytorch_model.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:405b00d092f4b9e0ffba7affe4d067f69257acd910aef34a9785bea244dd3221
3
+ size 172858354
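As a sanity check, the `size` in this LFS pointer can be compared with a parameter count estimated from `config.json`. The sketch below assumes the standard Hugging Face RoBERTa masked-LM layout (learned position embeddings, biases on every linear layer, a decoder weight tied to the word embeddings, no pooler). None of this is confirmed by the commit itself, and `roberta_mlm_param_count` is a hypothetical helper.

```python
def roberta_mlm_param_count(vocab_size, hidden, layers, intermediate,
                            max_positions, type_vocab_size):
    """Estimate parameters for a RoBERTa masked-LM (tied decoder, no pooler)."""

    def linear(n_in, n_out):
        return n_in * n_out + n_out              # weight + bias

    layer_norm = 2 * hidden                      # weight + bias

    embeddings = (vocab_size + max_positions + type_vocab_size) * hidden + layer_norm
    per_layer = (
        4 * linear(hidden, hidden)               # query, key, value, attention output
        + layer_norm
        + linear(hidden, intermediate)
        + linear(intermediate, hidden)
        + layer_norm
    )
    # LM head: dense + layer norm + output bias; decoder weight assumed tied.
    lm_head = linear(hidden, hidden) + layer_norm + vocab_size
    return embeddings + layers * per_layer + lm_head


# Values from config.json in this commit.
params = roberta_mlm_param_count(
    vocab_size=57, hidden=768, layers=6, intermediate=3072,
    max_positions=52, type_vocab_size=1,
)
estimated_bytes = params * 4          # torch_dtype is float32
reported_bytes = 172858354            # size field of the LFS pointer above
print(params, estimated_bytes, reported_bytes - estimated_bytes)
```

Under these assumptions the estimate is about 43.2 million parameters, and the float32 payload comes within a few tens of kilobytes of the reported file size. The remainder is plausibly serialization overhead, though this has not been checked against the actual checkpoint.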
special_tokens_map.json ADDED
@@ -0,0 +1,9 @@
1
+ {
2
+ "bos_token": "<s>",
3
+ "cls_token": "<s>",
4
+ "eos_token": "</s>",
5
+ "mask_token": "<mask>",
6
+ "pad_token": "<pad>",
7
+ "sep_token": "</s>",
8
+ "unk_token": "<unk>"
9
+ }
tokenizer.json ADDED
@@ -0,0 +1,152 @@
1
+ {
2
+ "version": "1.0",
3
+ "truncation": {
4
+ "direction": "Right",
5
+ "max_length": 50,
6
+ "strategy": "LongestFirst",
7
+ "stride": 0
8
+ },
9
+ "padding": null,
10
+ "added_tokens": [
11
+ {
12
+ "id": 0,
13
+ "content": "<s>",
14
+ "single_word": false,
15
+ "lstrip": false,
16
+ "rstrip": false,
17
+ "normalized": true,
18
+ "special": true
19
+ },
20
+ {
21
+ "id": 1,
22
+ "content": "</s>",
23
+ "single_word": false,
24
+ "lstrip": false,
25
+ "rstrip": false,
26
+ "normalized": true,
27
+ "special": true
28
+ },
29
+ {
30
+ "id": 2,
31
+ "content": "<unk>",
32
+ "single_word": false,
33
+ "lstrip": false,
34
+ "rstrip": false,
35
+ "normalized": true,
36
+ "special": true
37
+ },
38
+ {
39
+ "id": 3,
40
+ "content": "<pad>",
41
+ "single_word": false,
42
+ "lstrip": false,
43
+ "rstrip": false,
44
+ "normalized": true,
45
+ "special": true
46
+ },
47
+ {
48
+ "id": 4,
49
+ "content": "<mask>",
50
+ "single_word": false,
51
+ "lstrip": true,
52
+ "rstrip": false,
53
+ "normalized": false,
54
+ "special": true
55
+ }
56
+ ],
57
+ "normalizer": null,
58
+ "pre_tokenizer": {
59
+ "type": "ByteLevel",
60
+ "add_prefix_space": false,
61
+ "trim_offsets": true,
62
+ "use_regex": true
63
+ },
64
+ "post_processor": {
65
+ "type": "RobertaProcessing",
66
+ "sep": [
67
+ "</s>",
68
+ 1
69
+ ],
70
+ "cls": [
71
+ "<s>",
72
+ 0
73
+ ],
74
+ "trim_offsets": true,
75
+ "add_prefix_space": false
76
+ },
77
+ "decoder": {
78
+ "type": "ByteLevel",
79
+ "add_prefix_space": true,
80
+ "trim_offsets": true,
81
+ "use_regex": true
82
+ },
83
+ "model": {
84
+ "type": "BPE",
85
+ "dropout": null,
86
+ "unk_token": null,
87
+ "continuing_subword_prefix": "",
88
+ "end_of_word_suffix": "",
89
+ "fuse_unk": false,
90
+ "byte_fallback": false,
91
+ "vocab": {
92
+ "<s>": 0,
93
+ "</s>": 1,
94
+ "<unk>": 2,
95
+ "<pad>": 3,
96
+ "<mask>": 4,
97
+ "a": 5,
98
+ "b": 6,
99
+ "c": 7,
100
+ "d": 8,
101
+ "e": 9,
102
+ "f": 10,
103
+ "g": 11,
104
+ "h": 12,
105
+ "i": 13,
106
+ "j": 14,
107
+ "k": 15,
108
+ "l": 16,
109
+ "m": 17,
110
+ "n": 18,
111
+ "o": 19,
112
+ "p": 20,
113
+ "q": 21,
114
+ "r": 22,
115
+ "s": 23,
116
+ "t": 24,
117
+ "u": 25,
118
+ "v": 26,
119
+ "w": 27,
120
+ "x": 28,
121
+ "y": 29,
122
+ "z": 30,
123
+ "p</w>": 31,
124
+ "g</w>": 32,
125
+ "m</w>": 33,
126
+ "d</w>": 34,
127
+ "x</w>": 35,
128
+ "z</w>": 36,
129
+ "k</w>": 37,
130
+ "n</w>": 38,
131
+ "j</w>": 39,
132
+ "o</w>": 40,
133
+ "l</w>": 41,
134
+ "a</w>": 42,
135
+ "f</w>": 43,
136
+ "b</w>": 44,
137
+ "h</w>": 45,
138
+ "s</w>": 46,
139
+ "i</w>": 47,
140
+ "w</w>": 48,
141
+ "e</w>": 49,
142
+ "q</w>": 50,
143
+ "u</w>": 51,
144
+ "y</w>": 52,
145
+ "t</w>": 53,
146
+ "c</w>": 54,
147
+ "r</w>": 55,
148
+ "v</w>": 56
149
+ },
150
+ "merges": []
151
+ }
152
+ }
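Because `merges` is empty in this `tokenizer.json`, the BPE model can never combine symbols, so a lowercase word should come out as one token per character, wrapped in `<s>` and `</s>` by the `RobertaProcessing` post-processor. The sketch below approximates that behaviour in plain Python for input restricted to a-z. It is not the tokenizer itself, `encode_word` is a hypothetical helper, and it says nothing about spaces, uppercase letters, or the `</w>` entries in the vocab.

```python
import string

# Character entries from the vocab in this commit: "a" -> 5 ... "z" -> 30.
CHAR_IDS = {ch: 5 + i for i, ch in enumerate(string.ascii_lowercase)}
CLS_ID, SEP_ID = 0, 1   # "<s>" and "</s>" as given in the post_processor section
MAX_LENGTH = 50         # truncation.max_length, assumed to include special tokens


def encode_word(word: str) -> list[int]:
    """Approximate the tokenizer's output for a lowercase a-z string."""
    budget = MAX_LENGTH - 2                        # leave room for <s> and </s>
    body = [CHAR_IDS[ch] for ch in word[:budget]]  # KeyError for anything outside a-z
    return [CLS_ID, *body, SEP_ID]


print(encode_word("cat"))
```

Checking a few outputs like this against the real tokenizer would show whether the assumptions about truncation and special-token handling hold.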
tokenizer_config.json ADDED
@@ -0,0 +1,59 @@
1
+ {
2
+ "add_prefix_space": false,
3
+ "added_tokens_decoder": {
4
+ "0": {
5
+ "content": "<s>",
6
+ "lstrip": false,
7
+ "normalized": true,
8
+ "rstrip": false,
9
+ "single_word": false,
10
+ "special": true
11
+ },
12
+ "1": {
13
+ "content": "</s>",
14
+ "lstrip": false,
15
+ "normalized": true,
16
+ "rstrip": false,
17
+ "single_word": false,
18
+ "special": true
19
+ },
20
+ "2": {
21
+ "content": "<unk>",
22
+ "lstrip": false,
23
+ "normalized": true,
24
+ "rstrip": false,
25
+ "single_word": false,
26
+ "special": true
27
+ },
28
+ "3": {
29
+ "content": "<pad>",
30
+ "lstrip": false,
31
+ "normalized": true,
32
+ "rstrip": false,
33
+ "single_word": false,
34
+ "special": true
35
+ },
36
+ "4": {
37
+ "content": "<mask>",
38
+ "lstrip": true,
39
+ "normalized": false,
40
+ "rstrip": false,
41
+ "single_word": false,
42
+ "special": true
43
+ }
44
+ },
45
+ "additional_special_tokens": [],
46
+ "bos_token": "<s>",
47
+ "clean_up_tokenization_spaces": true,
48
+ "cls_token": "<s>",
49
+ "eos_token": "</s>",
50
+ "errors": "replace",
51
+ "mask_token": "<mask>",
52
+ "max_len": 50,
53
+ "model_max_length": 50,
54
+ "pad_token": "<pad>",
55
+ "sep_token": "</s>",
56
+ "tokenizer_class": "RobertaTokenizer",
57
+ "trim_offsets": true,
58
+ "unk_token": "<unk>"
59
+ }
training_args.bin ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:2110dba07f9b082405f2eae986b7acfd213a36f05180abde565779b1cc9e6f9f
3
+ size 4536
vocab.json ADDED
@@ -0,0 +1 @@
1
+ {"<s>":0,"</s>":1,"<unk>":2,"<pad>":3,"<mask>":4,"a":5,"b":6,"c":7,"d":8,"e":9,"f":10,"g":11,"h":12,"i":13,"j":14,"k":15,"l":16,"m":17,"n":18,"o":19,"p":20,"q":21,"r":22,"s":23,"t":24,"u":25,"v":26,"w":27,"x":28,"y":29,"z":30,"p</w>":31,"g</w>":32,"m</w>":33,"d</w>":34,"x</w>":35,"z</w>":36,"k</w>":37,"n</w>":38,"j</w>":39,"o</w>":40,"l</w>":41,"a</w>":42,"f</w>":43,"b</w>":44,"h</w>":45,"s</w>":46,"i</w>":47,"w</w>":48,"e</w>":49,"q</w>":50,"u</w>":51,"y</w>":52,"t</w>":53,"c</w>":54,"r</w>":55,"v</w>":56}