makhataei committed
Commit b02f730
1 parent: d8662be

End of training

README.md CHANGED
@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
17
 
18
  This model is a fine-tuned version of [makhataei/qa-persian-mdeberta-v3-base-squad2](https://huggingface.co/makhataei/qa-persian-mdeberta-v3-base-squad2) on the pquad dataset.
19
  It achieves the following results on the evaluation set:
20
- - Loss: 2.4713
21
 
22
  ## Model description
23
 
@@ -36,7 +36,7 @@ More information needed
36
  ### Training hyperparameters
37
 
38
  The following hyperparameters were used during training:
39
- - learning_rate: 5e-05
40
  - train_batch_size: 5
41
  - eval_batch_size: 5
42
  - seed: 42
@@ -48,262 +48,262 @@ The following hyperparameters were used during training:
48
 
49
  | Training Loss | Epoch | Step | Validation Loss |
50
  |:-------------:|:-----:|:------:|:---------------:|
51
- | 0.6303 | 0.04 | 500 | 1.0524 |
52
- | 0.6072 | 0.08 | 1000 | 1.1925 |
53
- | 0.6577 | 0.12 | 1500 | 0.9827 |
54
- | 0.6057 | 0.16 | 2000 | 1.0477 |
55
- | 0.6033 | 0.19 | 2500 | 1.0451 |
56
- | 0.6192 | 0.23 | 3000 | 1.0471 |
57
- | 0.6037 | 0.27 | 3500 | 1.0151 |
58
- | 0.6043 | 0.31 | 4000 | 1.0837 |
59
- | 0.5914 | 0.35 | 4500 | 1.1688 |
60
- | 0.5731 | 0.39 | 5000 | 1.1192 |
61
- | 0.6051 | 0.43 | 5500 | 1.0116 |
62
- | 0.5836 | 0.47 | 6000 | 1.0353 |
63
- | 0.6039 | 0.51 | 6500 | 1.0284 |
64
- | 0.6132 | 0.55 | 7000 | 1.0560 |
65
- | 0.5856 | 0.58 | 7500 | 1.1219 |
66
- | 0.5806 | 0.62 | 8000 | 1.0780 |
67
- | 0.6125 | 0.66 | 8500 | 0.9812 |
68
- | 0.6356 | 0.7 | 9000 | 1.0886 |
69
- | 0.5905 | 0.74 | 9500 | 1.0187 |
70
- | 0.6037 | 0.78 | 10000 | 1.1142 |
71
- | 0.5453 | 0.82 | 10500 | 1.1535 |
72
- | 0.6065 | 0.86 | 11000 | 1.0309 |
73
- | 0.5638 | 0.9 | 11500 | 1.1029 |
74
- | 0.5774 | 0.94 | 12000 | 1.1913 |
75
- | 0.6295 | 0.97 | 12500 | 1.1206 |
76
- | 0.5427 | 1.01 | 13000 | 1.2944 |
77
- | 0.468 | 1.05 | 13500 | 1.2117 |
78
- | 0.4965 | 1.09 | 14000 | 1.1481 |
79
- | 0.4549 | 1.13 | 14500 | 1.3192 |
80
- | 0.4811 | 1.17 | 15000 | 1.1514 |
81
- | 0.4507 | 1.21 | 15500 | 1.2152 |
82
- | 0.4845 | 1.25 | 16000 | 1.2142 |
83
- | 0.4529 | 1.29 | 16500 | 1.2136 |
84
- | 0.5138 | 1.32 | 17000 | 1.2248 |
85
- | 0.4843 | 1.36 | 17500 | 1.1198 |
86
- | 0.501 | 1.4 | 18000 | 1.1145 |
87
- | 0.4898 | 1.44 | 18500 | 1.0842 |
88
- | 0.505 | 1.48 | 19000 | 1.1711 |
89
- | 0.4979 | 1.52 | 19500 | 1.1617 |
90
- | 0.4836 | 1.56 | 20000 | 1.1651 |
91
- | 0.498 | 1.6 | 20500 | 1.1534 |
92
- | 0.51 | 1.64 | 21000 | 1.1642 |
93
- | 0.4648 | 1.68 | 21500 | 1.2294 |
94
- | 0.4955 | 1.71 | 22000 | 1.1252 |
95
- | 0.4645 | 1.75 | 22500 | 1.3847 |
96
- | 0.503 | 1.79 | 23000 | 1.1256 |
97
- | 0.5167 | 1.83 | 23500 | 1.2102 |
98
- | 0.5039 | 1.87 | 24000 | 1.2769 |
99
- | 0.4669 | 1.91 | 24500 | 1.2660 |
100
- | 0.5273 | 1.95 | 25000 | 1.1858 |
101
- | 0.5167 | 1.99 | 25500 | 1.0963 |
102
- | 0.4213 | 2.03 | 26000 | 1.3819 |
103
- | 0.4117 | 2.06 | 26500 | 1.3915 |
104
- | 0.4107 | 2.1 | 27000 | 1.2944 |
105
- | 0.3568 | 2.14 | 27500 | 1.4241 |
106
- | 0.3969 | 2.18 | 28000 | 1.4366 |
107
- | 0.4176 | 2.22 | 28500 | 1.4275 |
108
- | 0.4129 | 2.26 | 29000 | 1.3802 |
109
- | 0.4394 | 2.3 | 29500 | 1.3639 |
110
- | 0.3945 | 2.34 | 30000 | 1.4072 |
111
- | 0.3682 | 2.38 | 30500 | 1.3478 |
112
- | 0.3564 | 2.42 | 31000 | 1.4649 |
113
- | 0.3839 | 2.45 | 31500 | 1.6282 |
114
- | 0.4148 | 2.49 | 32000 | 1.4738 |
115
- | 0.3998 | 2.53 | 32500 | 1.4206 |
116
- | 0.3987 | 2.57 | 33000 | 1.3601 |
117
- | 0.3881 | 2.61 | 33500 | 1.4010 |
118
- | 0.3852 | 2.65 | 34000 | 1.3936 |
119
- | 0.3703 | 2.69 | 34500 | 1.4996 |
120
- | 0.3955 | 2.73 | 35000 | 1.4243 |
121
- | 0.414 | 2.77 | 35500 | 1.3599 |
122
- | 0.3723 | 2.81 | 36000 | 1.4481 |
123
- | 0.3927 | 2.84 | 36500 | 1.4327 |
124
- | 0.3993 | 2.88 | 37000 | 1.3312 |
125
- | 0.4074 | 2.92 | 37500 | 1.3248 |
126
- | 0.4978 | 2.96 | 38000 | 1.2219 |
127
- | 0.4957 | 3.0 | 38500 | 1.1998 |
128
- | 0.3192 | 3.04 | 39000 | 1.5531 |
129
- | 0.385 | 3.08 | 39500 | 1.3462 |
130
- | 0.351 | 3.12 | 40000 | 1.3456 |
131
- | 0.3584 | 3.16 | 40500 | 1.4219 |
132
- | 0.3696 | 3.19 | 41000 | 1.5244 |
133
- | 0.3872 | 3.23 | 41500 | 1.5260 |
134
- | 0.3916 | 3.27 | 42000 | 1.3642 |
135
- | 0.3598 | 3.31 | 42500 | 1.5210 |
136
- | 0.3749 | 3.35 | 43000 | 1.3730 |
137
- | 0.3781 | 3.39 | 43500 | 1.3904 |
138
- | 0.38 | 3.43 | 44000 | 1.3847 |
139
- | 0.4019 | 3.47 | 44500 | 1.3194 |
140
- | 0.3876 | 3.51 | 45000 | 1.4494 |
141
- | 0.3916 | 3.55 | 45500 | 1.5578 |
142
- | 0.3895 | 3.58 | 46000 | 1.4429 |
143
- | 0.3647 | 3.62 | 46500 | 1.3499 |
144
- | 0.3848 | 3.66 | 47000 | 1.4542 |
145
- | 0.3748 | 3.7 | 47500 | 1.2933 |
146
- | 0.3892 | 3.74 | 48000 | 1.3987 |
147
- | 0.3807 | 3.78 | 48500 | 1.4392 |
148
- | 0.4057 | 3.82 | 49000 | 1.3771 |
149
- | 0.3922 | 3.86 | 49500 | 1.3830 |
150
- | 0.3976 | 3.9 | 50000 | 1.2871 |
151
- | 0.383 | 3.94 | 50500 | 1.4306 |
152
- | 0.3771 | 3.97 | 51000 | 1.3849 |
153
- | 0.3793 | 4.01 | 51500 | 1.5489 |
154
- | 0.2792 | 4.05 | 52000 | 1.5708 |
155
- | 0.2859 | 4.09 | 52500 | 1.5634 |
156
- | 0.2839 | 4.13 | 53000 | 1.6146 |
157
- | 0.3118 | 4.17 | 53500 | 1.5593 |
158
- | 0.3248 | 4.21 | 54000 | 1.5015 |
159
- | 0.2981 | 4.25 | 54500 | 1.5262 |
160
- | 0.2697 | 4.29 | 55000 | 1.6662 |
161
- | 0.2929 | 4.32 | 55500 | 1.6073 |
162
- | 0.3233 | 4.36 | 56000 | 1.4935 |
163
- | 0.2944 | 4.4 | 56500 | 1.5488 |
164
- | 0.3021 | 4.44 | 57000 | 1.5612 |
165
- | 0.3162 | 4.48 | 57500 | 1.6165 |
166
- | 0.337 | 4.52 | 58000 | 1.4389 |
167
- | 0.3071 | 4.56 | 58500 | 1.6181 |
168
- | 0.346 | 4.6 | 59000 | 1.5063 |
169
- | 0.3359 | 4.64 | 59500 | 1.5319 |
170
- | 0.283 | 4.68 | 60000 | 1.5716 |
171
- | 0.3184 | 4.71 | 60500 | 1.5787 |
172
- | 0.2911 | 4.75 | 61000 | 1.6882 |
173
- | 0.3325 | 4.79 | 61500 | 1.5195 |
174
- | 0.3223 | 4.83 | 62000 | 1.6573 |
175
- | 0.3225 | 4.87 | 62500 | 1.4265 |
176
- | 0.3028 | 4.91 | 63000 | 1.5742 |
177
- | 0.318 | 4.95 | 63500 | 1.5170 |
178
- | 0.3047 | 4.99 | 64000 | 1.5051 |
179
- | 0.2552 | 5.03 | 64500 | 1.7450 |
180
- | 0.2326 | 5.07 | 65000 | 1.6757 |
181
- | 0.2174 | 5.1 | 65500 | 1.9674 |
182
- | 0.2423 | 5.14 | 66000 | 1.8576 |
183
- | 0.2066 | 5.18 | 66500 | 1.7914 |
184
- | 0.2717 | 5.22 | 67000 | 1.8060 |
185
- | 0.2353 | 5.26 | 67500 | 1.7933 |
186
- | 0.2499 | 5.3 | 68000 | 1.7655 |
187
- | 0.2415 | 5.34 | 68500 | 1.9094 |
188
- | 0.2541 | 5.38 | 69000 | 1.7136 |
189
- | 0.2616 | 5.42 | 69500 | 1.7428 |
190
- | 0.2402 | 5.45 | 70000 | 1.8088 |
191
- | 0.2543 | 5.49 | 70500 | 1.6588 |
192
- | 0.2756 | 5.53 | 71000 | 1.6765 |
193
- | 0.2447 | 5.57 | 71500 | 1.8263 |
194
- | 0.2643 | 5.61 | 72000 | 1.6329 |
195
- | 0.224 | 5.65 | 72500 | 1.7456 |
196
- | 0.2385 | 5.69 | 73000 | 1.7220 |
197
- | 0.2488 | 5.73 | 73500 | 1.5867 |
198
- | 0.2424 | 5.77 | 74000 | 1.7738 |
199
- | 0.2618 | 5.81 | 74500 | 1.7264 |
200
- | 0.2498 | 5.84 | 75000 | 1.6741 |
201
- | 0.264 | 5.88 | 75500 | 1.6714 |
202
- | 0.2364 | 5.92 | 76000 | 1.6306 |
203
- | 0.2377 | 5.96 | 76500 | 1.8281 |
204
- | 0.2475 | 6.0 | 77000 | 1.6020 |
205
- | 0.1762 | 6.04 | 77500 | 1.9256 |
206
- | 0.182 | 6.08 | 78000 | 1.9239 |
207
- | 0.1714 | 6.12 | 78500 | 1.9346 |
208
- | 0.1702 | 6.16 | 79000 | 2.0071 |
209
- | 0.1885 | 6.19 | 79500 | 1.8634 |
210
- | 0.1933 | 6.23 | 80000 | 2.0296 |
211
- | 0.1973 | 6.27 | 80500 | 1.8691 |
212
- | 0.1698 | 6.31 | 81000 | 1.9280 |
213
- | 0.1935 | 6.35 | 81500 | 1.9555 |
214
- | 0.1892 | 6.39 | 82000 | 1.9595 |
215
- | 0.1879 | 6.43 | 82500 | 1.9741 |
216
- | 0.1939 | 6.47 | 83000 | 2.0260 |
217
- | 0.1928 | 6.51 | 83500 | 2.0924 |
218
- | 0.1906 | 6.55 | 84000 | 1.9643 |
219
- | 0.1729 | 6.58 | 84500 | 2.1318 |
220
- | 0.2198 | 6.62 | 85000 | 1.8794 |
221
- | 0.1941 | 6.66 | 85500 | 1.9834 |
222
- | 0.1798 | 6.7 | 86000 | 2.0396 |
223
- | 0.2141 | 6.74 | 86500 | 1.8159 |
224
- | 0.1748 | 6.78 | 87000 | 2.0235 |
225
- | 0.2038 | 6.82 | 87500 | 1.9760 |
226
- | 0.1948 | 6.86 | 88000 | 1.9607 |
227
- | 0.209 | 6.9 | 88500 | 1.9526 |
228
- | 0.1951 | 6.94 | 89000 | 2.0364 |
229
- | 0.2238 | 6.97 | 89500 | 1.8029 |
230
- | 0.1913 | 7.01 | 90000 | 2.0869 |
231
- | 0.153 | 7.05 | 90500 | 2.1914 |
232
- | 0.1393 | 7.09 | 91000 | 2.2019 |
233
- | 0.145 | 7.13 | 91500 | 2.1408 |
234
- | 0.1483 | 7.17 | 92000 | 2.1024 |
235
- | 0.1396 | 7.21 | 92500 | 2.1224 |
236
- | 0.1313 | 7.25 | 93000 | 2.1517 |
237
- | 0.1288 | 7.29 | 93500 | 2.2002 |
238
- | 0.1569 | 7.32 | 94000 | 2.1955 |
239
- | 0.1291 | 7.36 | 94500 | 2.3081 |
240
- | 0.1702 | 7.4 | 95000 | 2.0735 |
241
- | 0.127 | 7.44 | 95500 | 2.0001 |
242
- | 0.1503 | 7.48 | 96000 | 2.1695 |
243
- | 0.1356 | 7.52 | 96500 | 2.1271 |
244
- | 0.1466 | 7.56 | 97000 | 2.0921 |
245
- | 0.1408 | 7.6 | 97500 | 2.1379 |
246
- | 0.1367 | 7.64 | 98000 | 2.0763 |
247
- | 0.1487 | 7.68 | 98500 | 2.2021 |
248
- | 0.1657 | 7.71 | 99000 | 2.0800 |
249
- | 0.1408 | 7.75 | 99500 | 2.1433 |
250
- | 0.1328 | 7.79 | 100000 | 2.0924 |
251
- | 0.1485 | 7.83 | 100500 | 2.1479 |
252
- | 0.1546 | 7.87 | 101000 | 2.0750 |
253
- | 0.1501 | 7.91 | 101500 | 2.0885 |
254
- | 0.1391 | 7.95 | 102000 | 2.1003 |
255
- | 0.173 | 7.99 | 102500 | 1.9603 |
256
- | 0.096 | 8.03 | 103000 | 2.2128 |
257
- | 0.0967 | 8.07 | 103500 | 2.2105 |
258
- | 0.0909 | 8.1 | 104000 | 2.2345 |
259
- | 0.086 | 8.14 | 104500 | 2.3129 |
260
- | 0.1052 | 8.18 | 105000 | 2.3452 |
261
- | 0.0975 | 8.22 | 105500 | 2.3279 |
262
- | 0.0875 | 8.26 | 106000 | 2.3719 |
263
- | 0.1167 | 8.3 | 106500 | 2.2740 |
264
- | 0.0724 | 8.34 | 107000 | 2.3902 |
265
- | 0.1067 | 8.38 | 107500 | 2.3961 |
266
- | 0.1017 | 8.42 | 108000 | 2.2360 |
267
- | 0.1003 | 8.45 | 108500 | 2.2271 |
268
- | 0.1113 | 8.49 | 109000 | 2.3305 |
269
- | 0.113 | 8.53 | 109500 | 2.2344 |
270
- | 0.1047 | 8.57 | 110000 | 2.2780 |
271
- | 0.0935 | 8.61 | 110500 | 2.3290 |
272
- | 0.1159 | 8.65 | 111000 | 2.3176 |
273
- | 0.0936 | 8.69 | 111500 | 2.3421 |
274
- | 0.0954 | 8.73 | 112000 | 2.2757 |
275
- | 0.1131 | 8.77 | 112500 | 2.2388 |
276
- | 0.0939 | 8.81 | 113000 | 2.3273 |
277
- | 0.1026 | 8.84 | 113500 | 2.2831 |
278
- | 0.0842 | 8.88 | 114000 | 2.3705 |
279
- | 0.1031 | 8.92 | 114500 | 2.3365 |
280
- | 0.1114 | 8.96 | 115000 | 2.2940 |
281
- | 0.1145 | 9.0 | 115500 | 2.2661 |
282
- | 0.0796 | 9.04 | 116000 | 2.3672 |
283
- | 0.0591 | 9.08 | 116500 | 2.5256 |
284
- | 0.0724 | 9.12 | 117000 | 2.4654 |
285
- | 0.0733 | 9.16 | 117500 | 2.4303 |
286
- | 0.0727 | 9.2 | 118000 | 2.5239 |
287
- | 0.0649 | 9.23 | 118500 | 2.4831 |
288
- | 0.0874 | 9.27 | 119000 | 2.4823 |
289
- | 0.0679 | 9.31 | 119500 | 2.5225 |
290
- | 0.0798 | 9.35 | 120000 | 2.4684 |
291
- | 0.0774 | 9.39 | 120500 | 2.4247 |
292
- | 0.0837 | 9.43 | 121000 | 2.3901 |
293
- | 0.077 | 9.47 | 121500 | 2.4002 |
294
- | 0.0591 | 9.51 | 122000 | 2.4534 |
295
- | 0.0598 | 9.55 | 122500 | 2.4878 |
296
- | 0.0662 | 9.58 | 123000 | 2.5026 |
297
- | 0.0716 | 9.62 | 123500 | 2.4876 |
298
- | 0.0744 | 9.66 | 124000 | 2.4856 |
299
- | 0.0759 | 9.7 | 124500 | 2.4703 |
300
- | 0.0713 | 9.74 | 125000 | 2.4614 |
301
- | 0.0687 | 9.78 | 125500 | 2.4629 |
302
- | 0.0706 | 9.82 | 126000 | 2.4621 |
303
- | 0.0702 | 9.86 | 126500 | 2.4521 |
304
- | 0.0609 | 9.9 | 127000 | 2.4698 |
305
- | 0.0782 | 9.94 | 127500 | 2.4702 |
306
- | 0.062 | 9.97 | 128000 | 2.4713 |
307
 
308
 
309
  ### Framework versions
 
17
 
18
  This model is a fine-tuned version of [makhataei/qa-persian-mdeberta-v3-base-squad2](https://huggingface.co/makhataei/qa-persian-mdeberta-v3-base-squad2) on the pquad dataset.
19
  It achieves the following results on the evaluation set:
20
+ - Loss: 2.4944
21
 
22
  ## Model description
23
 
 
36
  ### Training hyperparameters
37
 
38
  The following hyperparameters were used during training:
39
+ - learning_rate: 2.5e-05
40
  - train_batch_size: 5
41
  - eval_batch_size: 5
42
  - seed: 42
 
48
 
49
  | Training Loss | Epoch | Step | Validation Loss |
50
  |:-------------:|:-----:|:------:|:---------------:|
51
+ | 0.4273 | 0.04 | 500 | 1.2636 |
52
+ | 0.3813 | 0.08 | 1000 | 1.2619 |
53
+ | 0.3933 | 0.12 | 1500 | 1.3319 |
54
+ | 0.3305 | 0.16 | 2000 | 1.4050 |
55
+ | 0.3604 | 0.19 | 2500 | 1.4107 |
56
+ | 0.3431 | 0.23 | 3000 | 1.3068 |
57
+ | 0.3258 | 0.27 | 3500 | 1.3487 |
58
+ | 0.3432 | 0.31 | 4000 | 1.4339 |
59
+ | 0.3429 | 0.35 | 4500 | 1.3738 |
60
+ | 0.3279 | 0.39 | 5000 | 1.4084 |
61
+ | 0.3268 | 0.43 | 5500 | 1.3671 |
62
+ | 0.3352 | 0.47 | 6000 | 1.3462 |
63
+ | 0.3233 | 0.51 | 6500 | 1.3703 |
64
+ | 0.3157 | 0.55 | 7000 | 1.4630 |
65
+ | 0.3005 | 0.58 | 7500 | 1.4817 |
66
+ | 0.2708 | 0.62 | 8000 | 1.4972 |
67
+ | 0.3227 | 0.66 | 8500 | 1.4029 |
68
+ | 0.6272 | 0.7 | 9000 | 1.0431 |
69
+ | 0.5573 | 0.74 | 9500 | 1.0920 |
70
+ | 0.5744 | 0.78 | 10000 | 1.0445 |
71
+ | 0.5128 | 0.82 | 10500 | 1.0858 |
72
+ | 0.5503 | 0.86 | 11000 | 1.0169 |
73
+ | 0.5128 | 0.9 | 11500 | 1.0771 |
74
+ | 0.5281 | 0.94 | 12000 | 1.0501 |
75
+ | 0.5347 | 0.97 | 12500 | 1.0867 |
76
+ | 0.4619 | 1.01 | 13000 | 1.2511 |
77
+ | 0.3691 | 1.05 | 13500 | 1.2085 |
78
+ | 0.3785 | 1.09 | 14000 | 1.2875 |
79
+ | 0.3309 | 1.13 | 14500 | 1.3301 |
80
+ | 0.3818 | 1.17 | 15000 | 1.2383 |
81
+ | 0.3546 | 1.21 | 15500 | 1.2575 |
82
+ | 0.3568 | 1.25 | 16000 | 1.3351 |
83
+ | 0.3475 | 1.29 | 16500 | 1.3030 |
84
+ | 0.3801 | 1.32 | 17000 | 1.3151 |
85
+ | 0.383 | 1.36 | 17500 | 1.2519 |
86
+ | 0.3878 | 1.4 | 18000 | 1.2408 |
87
+ | 0.3568 | 1.44 | 18500 | 1.2846 |
88
+ | 0.3901 | 1.48 | 19000 | 1.1482 |
89
+ | 0.3732 | 1.52 | 19500 | 1.2964 |
90
+ | 0.3585 | 1.56 | 20000 | 1.2875 |
91
+ | 0.3854 | 1.6 | 20500 | 1.2647 |
92
+ | 0.3802 | 1.64 | 21000 | 1.2905 |
93
+ | 0.3383 | 1.68 | 21500 | 1.3686 |
94
+ | 0.3809 | 1.71 | 22000 | 1.2277 |
95
+ | 0.3487 | 1.75 | 22500 | 1.3850 |
96
+ | 0.3704 | 1.79 | 23000 | 1.2682 |
97
+ | 0.3868 | 1.83 | 23500 | 1.3091 |
98
+ | 0.3772 | 1.87 | 24000 | 1.2671 |
99
+ | 0.3492 | 1.91 | 24500 | 1.3259 |
100
+ | 0.4124 | 1.95 | 25000 | 1.2334 |
101
+ | 0.3716 | 1.99 | 25500 | 1.2383 |
102
+ | 0.3068 | 2.03 | 26000 | 1.4346 |
103
+ | 0.2693 | 2.06 | 26500 | 1.5702 |
104
+ | 0.2776 | 2.1 | 27000 | 1.4791 |
105
+ | 0.2574 | 2.14 | 27500 | 1.5752 |
106
+ | 0.2764 | 2.18 | 28000 | 1.6362 |
107
+ | 0.3035 | 2.22 | 28500 | 1.5172 |
108
+ | 0.2961 | 2.26 | 29000 | 1.4787 |
109
+ | 0.3115 | 2.3 | 29500 | 1.5763 |
110
+ | 0.2846 | 2.34 | 30000 | 1.4942 |
111
+ | 0.2971 | 2.38 | 30500 | 1.4641 |
112
+ | 0.2448 | 2.42 | 31000 | 1.6608 |
113
+ | 0.2864 | 2.45 | 31500 | 1.5140 |
114
+ | 0.3112 | 2.49 | 32000 | 1.5064 |
115
+ | 0.2768 | 2.53 | 32500 | 1.6051 |
116
+ | 0.2938 | 2.57 | 33000 | 1.6976 |
117
+ | 0.2839 | 2.61 | 33500 | 1.4711 |
118
+ | 0.2675 | 2.65 | 34000 | 1.5766 |
119
+ | 0.273 | 2.69 | 34500 | 1.5526 |
120
+ | 0.2446 | 2.73 | 35000 | 1.6282 |
121
+ | 0.2921 | 2.77 | 35500 | 1.4750 |
122
+ | 0.2433 | 2.81 | 36000 | 1.5918 |
123
+ | 0.2634 | 2.84 | 36500 | 1.5804 |
124
+ | 0.2726 | 2.88 | 37000 | 1.5430 |
125
+ | 0.2678 | 2.92 | 37500 | 1.5456 |
126
+ | 0.3963 | 2.96 | 38000 | 1.4429 |
127
+ | 0.3874 | 3.0 | 38500 | 1.3743 |
128
+ | 0.2386 | 3.04 | 39000 | 1.6718 |
129
+ | 0.2666 | 3.08 | 39500 | 1.6247 |
130
+ | 0.2452 | 3.12 | 40000 | 1.6553 |
131
+ | 0.2684 | 3.16 | 40500 | 1.5948 |
132
+ | 0.2741 | 3.19 | 41000 | 1.6774 |
133
+ | 0.2915 | 3.23 | 41500 | 1.6423 |
134
+ | 0.289 | 3.27 | 42000 | 1.6159 |
135
+ | 0.2572 | 3.31 | 42500 | 1.6878 |
136
+ | 0.2888 | 3.35 | 43000 | 1.6022 |
137
+ | 0.2787 | 3.39 | 43500 | 1.6714 |
138
+ | 0.2762 | 3.43 | 44000 | 1.6734 |
139
+ | 0.304 | 3.47 | 44500 | 1.6225 |
140
+ | 0.2964 | 3.51 | 45000 | 1.6075 |
141
+ | 0.3047 | 3.55 | 45500 | 1.6200 |
142
+ | 0.2929 | 3.58 | 46000 | 1.5646 |
143
+ | 0.2828 | 3.62 | 46500 | 1.5764 |
144
+ | 0.2882 | 3.66 | 47000 | 1.6570 |
145
+ | 0.2756 | 3.7 | 47500 | 1.5030 |
146
+ | 0.2741 | 3.74 | 48000 | 1.6237 |
147
+ | 0.2819 | 3.78 | 48500 | 1.5456 |
148
+ | 0.3243 | 3.82 | 49000 | 1.5030 |
149
+ | 0.2999 | 3.86 | 49500 | 1.6339 |
150
+ | 0.2867 | 3.9 | 50000 | 1.6627 |
151
+ | 0.2834 | 3.94 | 50500 | 1.6580 |
152
+ | 0.2784 | 3.97 | 51000 | 1.6321 |
153
+ | 0.2846 | 4.01 | 51500 | 1.5986 |
154
+ | 0.2059 | 4.05 | 52000 | 1.7993 |
155
+ | 0.2204 | 4.09 | 52500 | 1.7942 |
156
+ | 0.2144 | 4.13 | 53000 | 1.7884 |
157
+ | 0.2385 | 4.17 | 53500 | 1.7064 |
158
+ | 0.2225 | 4.21 | 54000 | 1.7386 |
159
+ | 0.2119 | 4.25 | 54500 | 1.9515 |
160
+ | 0.2033 | 4.29 | 55000 | 1.8603 |
161
+ | 0.2121 | 4.32 | 55500 | 1.8144 |
162
+ | 0.2489 | 4.36 | 56000 | 1.7729 |
163
+ | 0.2284 | 4.4 | 56500 | 1.8237 |
164
+ | 0.2319 | 4.44 | 57000 | 1.8922 |
165
+ | 0.2425 | 4.48 | 57500 | 1.7491 |
166
+ | 0.2535 | 4.52 | 58000 | 1.6738 |
167
+ | 0.2251 | 4.56 | 58500 | 1.7717 |
168
+ | 0.2449 | 4.6 | 59000 | 1.7209 |
169
+ | 0.2472 | 4.64 | 59500 | 1.6438 |
170
+ | 0.2179 | 4.68 | 60000 | 1.8039 |
171
+ | 0.2635 | 4.71 | 60500 | 1.6948 |
172
+ | 0.2301 | 4.75 | 61000 | 1.8228 |
173
+ | 0.2454 | 4.79 | 61500 | 1.6865 |
174
+ | 0.2146 | 4.83 | 62000 | 1.8147 |
175
+ | 0.2639 | 4.87 | 62500 | 1.6340 |
176
+ | 0.2488 | 4.91 | 63000 | 1.7649 |
177
+ | 0.2448 | 4.95 | 63500 | 1.7029 |
178
+ | 0.2373 | 4.99 | 64000 | 1.8508 |
179
+ | 0.1982 | 5.03 | 64500 | 1.8193 |
180
+ | 0.1676 | 5.07 | 65000 | 1.9439 |
181
+ | 0.1397 | 5.1 | 65500 | 2.0506 |
182
+ | 0.1829 | 5.14 | 66000 | 1.9656 |
183
+ | 0.1469 | 5.18 | 66500 | 2.0149 |
184
+ | 0.2015 | 5.22 | 67000 | 1.9251 |
185
+ | 0.1728 | 5.26 | 67500 | 1.9232 |
186
+ | 0.214 | 5.3 | 68000 | 1.7829 |
187
+ | 0.1744 | 5.34 | 68500 | 2.0301 |
188
+ | 0.1734 | 5.38 | 69000 | 1.9325 |
189
+ | 0.2109 | 5.42 | 69500 | 1.9063 |
190
+ | 0.19 | 5.45 | 70000 | 1.9691 |
191
+ | 0.1947 | 5.49 | 70500 | 1.9812 |
192
+ | 0.198 | 5.53 | 71000 | 1.9603 |
193
+ | 0.1889 | 5.57 | 71500 | 1.9647 |
194
+ | 0.198 | 5.61 | 72000 | 1.8880 |
195
+ | 0.1741 | 5.65 | 72500 | 2.0263 |
196
+ | 0.1775 | 5.69 | 73000 | 1.9311 |
197
+ | 0.1971 | 5.73 | 73500 | 1.9250 |
198
+ | 0.183 | 5.77 | 74000 | 2.0464 |
199
+ | 0.1816 | 5.81 | 74500 | 1.9924 |
200
+ | 0.21 | 5.84 | 75000 | 1.8805 |
201
+ | 0.1999 | 5.88 | 75500 | 1.8812 |
202
+ | 0.2089 | 5.92 | 76000 | 1.8398 |
203
+ | 0.1945 | 5.96 | 76500 | 1.9466 |
204
+ | 0.1828 | 6.0 | 77000 | 1.9279 |
205
+ | 0.1423 | 6.04 | 77500 | 2.0748 |
206
+ | 0.1327 | 6.08 | 78000 | 2.0871 |
207
+ | 0.1297 | 6.12 | 78500 | 2.1302 |
208
+ | 0.1313 | 6.16 | 79000 | 2.1704 |
209
+ | 0.1463 | 6.19 | 79500 | 2.0676 |
210
+ | 0.1496 | 6.23 | 80000 | 2.0896 |
211
+ | 0.128 | 6.27 | 80500 | 2.2031 |
212
+ | 0.1761 | 6.31 | 81000 | 2.0441 |
213
+ | 0.15 | 6.35 | 81500 | 2.1346 |
214
+ | 0.1787 | 6.39 | 82000 | 1.9899 |
215
+ | 0.1407 | 6.43 | 82500 | 2.0616 |
216
+ | 0.1366 | 6.47 | 83000 | 2.2158 |
217
+ | 0.149 | 6.51 | 83500 | 2.1434 |
218
+ | 0.1295 | 6.55 | 84000 | 2.2094 |
219
+ | 0.1423 | 6.58 | 84500 | 2.1137 |
220
+ | 0.1595 | 6.62 | 85000 | 2.0735 |
221
+ | 0.1494 | 6.66 | 85500 | 2.0534 |
222
+ | 0.1315 | 6.7 | 86000 | 2.1229 |
223
+ | 0.1778 | 6.74 | 86500 | 2.1022 |
224
+ | 0.1234 | 6.78 | 87000 | 2.1475 |
225
+ | 0.1531 | 6.82 | 87500 | 2.0641 |
226
+ | 0.1537 | 6.86 | 88000 | 2.0913 |
227
+ | 0.1734 | 6.9 | 88500 | 2.0269 |
228
+ | 0.1531 | 6.94 | 89000 | 2.0718 |
229
+ | 0.1731 | 6.97 | 89500 | 2.0188 |
230
+ | 0.1496 | 7.01 | 90000 | 2.2257 |
231
+ | 0.1202 | 7.05 | 90500 | 2.1846 |
232
+ | 0.1125 | 7.09 | 91000 | 2.3543 |
233
+ | 0.1127 | 7.13 | 91500 | 2.3571 |
234
+ | 0.1303 | 7.17 | 92000 | 2.2526 |
235
+ | 0.1151 | 7.21 | 92500 | 2.1961 |
236
+ | 0.1148 | 7.25 | 93000 | 2.2848 |
237
+ | 0.1097 | 7.29 | 93500 | 2.3361 |
238
+ | 0.1132 | 7.32 | 94000 | 2.3850 |
239
+ | 0.0794 | 7.36 | 94500 | 2.4030 |
240
+ | 0.1133 | 7.4 | 95000 | 2.2968 |
241
+ | 0.1174 | 7.44 | 95500 | 2.2693 |
242
+ | 0.1178 | 7.48 | 96000 | 2.2723 |
243
+ | 0.0895 | 7.52 | 96500 | 2.3682 |
244
+ | 0.1269 | 7.56 | 97000 | 2.2746 |
245
+ | 0.1124 | 7.6 | 97500 | 2.2634 |
246
+ | 0.1354 | 7.64 | 98000 | 2.2400 |
247
+ | 0.1329 | 7.68 | 98500 | 2.2261 |
248
+ | 0.1363 | 7.71 | 99000 | 2.2394 |
249
+ | 0.1219 | 7.75 | 99500 | 2.2641 |
250
+ | 0.1067 | 7.79 | 100000 | 2.3639 |
251
+ | 0.1243 | 7.83 | 100500 | 2.2853 |
252
+ | 0.1429 | 7.87 | 101000 | 2.2218 |
253
+ | 0.1282 | 7.91 | 101500 | 2.2358 |
254
+ | 0.1277 | 7.95 | 102000 | 2.2241 |
255
+ | 0.143 | 7.99 | 102500 | 2.1506 |
256
+ | 0.0959 | 8.03 | 103000 | 2.2565 |
257
+ | 0.0911 | 8.07 | 103500 | 2.3629 |
258
+ | 0.0923 | 8.1 | 104000 | 2.3459 |
259
+ | 0.094 | 8.14 | 104500 | 2.3670 |
260
+ | 0.0983 | 8.18 | 105000 | 2.3862 |
261
+ | 0.114 | 8.22 | 105500 | 2.3531 |
262
+ | 0.0783 | 8.26 | 106000 | 2.4318 |
263
+ | 0.0998 | 8.3 | 106500 | 2.3581 |
264
+ | 0.0627 | 8.34 | 107000 | 2.5447 |
265
+ | 0.1007 | 8.38 | 107500 | 2.4340 |
266
+ | 0.1046 | 8.42 | 108000 | 2.4324 |
267
+ | 0.0896 | 8.45 | 108500 | 2.3896 |
268
+ | 0.1194 | 8.49 | 109000 | 2.3735 |
269
+ | 0.0913 | 8.53 | 109500 | 2.3917 |
270
+ | 0.1212 | 8.57 | 110000 | 2.3616 |
271
+ | 0.0998 | 8.61 | 110500 | 2.3847 |
272
+ | 0.0902 | 8.65 | 111000 | 2.4282 |
273
+ | 0.0786 | 8.69 | 111500 | 2.4669 |
274
+ | 0.0944 | 8.73 | 112000 | 2.4121 |
275
+ | 0.1072 | 8.77 | 112500 | 2.3918 |
276
+ | 0.1386 | 8.81 | 113000 | 2.3239 |
277
+ | 0.098 | 8.84 | 113500 | 2.3491 |
278
+ | 0.0997 | 8.88 | 114000 | 2.3698 |
279
+ | 0.1054 | 8.92 | 114500 | 2.4200 |
280
+ | 0.1069 | 8.96 | 115000 | 2.3614 |
281
+ | 0.1103 | 9.0 | 115500 | 2.3551 |
282
+ | 0.0943 | 9.04 | 116000 | 2.4380 |
283
+ | 0.0881 | 9.08 | 116500 | 2.4843 |
284
+ | 0.0665 | 9.12 | 117000 | 2.5239 |
285
+ | 0.0789 | 9.16 | 117500 | 2.5221 |
286
+ | 0.0773 | 9.2 | 118000 | 2.5397 |
287
+ | 0.0818 | 9.23 | 118500 | 2.4990 |
288
+ | 0.0684 | 9.27 | 119000 | 2.5446 |
289
+ | 0.0711 | 9.31 | 119500 | 2.5097 |
290
+ | 0.0842 | 9.35 | 120000 | 2.5173 |
291
+ | 0.0819 | 9.39 | 120500 | 2.4953 |
292
+ | 0.0753 | 9.43 | 121000 | 2.5070 |
293
+ | 0.09 | 9.47 | 121500 | 2.4626 |
294
+ | 0.0761 | 9.51 | 122000 | 2.4711 |
295
+ | 0.074 | 9.55 | 122500 | 2.4678 |
296
+ | 0.0789 | 9.58 | 123000 | 2.4595 |
297
+ | 0.0668 | 9.62 | 123500 | 2.4830 |
298
+ | 0.0912 | 9.66 | 124000 | 2.4984 |
299
+ | 0.0856 | 9.7 | 124500 | 2.4839 |
300
+ | 0.0806 | 9.74 | 125000 | 2.4717 |
301
+ | 0.0842 | 9.78 | 125500 | 2.4759 |
302
+ | 0.0876 | 9.82 | 126000 | 2.4794 |
303
+ | 0.0788 | 9.86 | 126500 | 2.4893 |
304
+ | 0.0671 | 9.9 | 127000 | 2.4955 |
305
+ | 0.0897 | 9.94 | 127500 | 2.4928 |
306
+ | 0.0685 | 9.97 | 128000 | 2.4944 |
307
 
308
 
309
  ### Framework versions
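
The training-results table in the README diff above can be scanned by script rather than by eye, for example to find the logged step with the lowest validation loss. The following is a minimal sketch, not part of the commit: the names `LogRow`, `parse_rows`, and `SAMPLE` are hypothetical, and `SAMPLE` holds only five rows copied from the updated (`+`) side of the table, so the result applies to that sample and not to the whole run.

```python
from typing import List, NamedTuple


class LogRow(NamedTuple):
    train_loss: float
    epoch: float
    step: int
    val_loss: float


def parse_rows(markdown: str) -> List[LogRow]:
    """Parse 4-column markdown table rows, tolerating a leading diff marker."""
    rows = []
    for line in markdown.splitlines():
        # Drop a leading "+" or "-" diff marker, then the outer pipes.
        body = line.strip().lstrip("+-").strip().strip("|")
        cells = [cell.strip() for cell in body.split("|")]
        if len(cells) != 4:
            continue
        try:
            rows.append(
                LogRow(float(cells[0]), float(cells[1]), int(cells[2]), float(cells[3]))
            )
        except ValueError:
            # Header and alignment rows are not numeric; skip them.
            continue
    return rows


# Five rows copied from the updated table (a sample, not the full log).
SAMPLE = """
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
+ | 0.4273 | 0.04 | 500 | 1.2636 |
+ | 0.5503 | 0.86 | 11000 | 1.0169 |
+ | 0.3901 | 1.48 | 19000 | 1.1482 |
+ | 0.2846 | 4.01 | 51500 | 1.5986 |
+ | 0.0685 | 9.97 | 128000 | 2.4944 |
"""

rows = parse_rows(SAMPLE)
best = min(rows, key=lambda row: row.val_loss)
print(f"lowest validation loss in sample: {best.val_loss} at step {best.step}")
```

In these sampled rows the training loss keeps falling while the validation loss rises after the first epoch, so a check like this may help when deciding which checkpoint to keep.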
model.safetensors CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:200eb62b2ddf181030616f3904b90b7f2a09a8e65b4e7bce28b747712fc0e5a1
3
  size 1112905680
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:9b36c353340ab62bc723e54e2cdad6ddd8807ff03a25921de69160fa82671567
3
  size 1112905680
runs/Nov28_09-36-28_Software-AI/events.out.tfevents.1701151589.Software-AI.10944.2 ADDED
@@ -0,0 +1,3 @@
1
+ version https://git-lfs.github.com/spec/v1
2
+ oid sha256:403f757cef7ee3f9bfa0b9c4ab85bda2e7de9ce7574dc731833e6ac09e690064
3
+ size 116311
training_args.bin CHANGED
@@ -1,3 +1,3 @@
1
  version https://git-lfs.github.com/spec/v1
2
- oid sha256:ff320f9c9d6b606d08c02765fa1f3d1f5688dcbe354925aa77f75fb3f9795ed5
3
  size 4219
 
1
  version https://git-lfs.github.com/spec/v1
2
+ oid sha256:c38cacb29e9571f92b8a98fd4f574e038e109cc118698443e3e39c8ced5d8c86
3
  size 4219
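
The `model.safetensors`, `events.out.tfevents...`, and `training_args.bin` entries above are Git LFS pointer files: three `key value` lines giving the spec version, a SHA-256 object ID, and the size in bytes. As a rough illustration (the helper name `parse_lfs_pointer` is hypothetical and not part of any Git LFS tooling), such a pointer could be read like this, using the new `model.safetensors` pointer from this commit as input:

```python
def parse_lfs_pointer(text: str) -> dict:
    """Split a Git LFS pointer file into its version, hash, and size fields."""
    fields = {}
    for line in text.splitlines():
        # Each pointer line is "<key> <value>".
        key, _, value = line.strip().partition(" ")
        if key and value:
            fields[key] = value
    # The oid field has the form "<algorithm>:<hex digest>".
    algo, _, digest = fields.get("oid", "").partition(":")
    return {
        "version": fields.get("version"),
        "hash_algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


# Pointer contents copied from the updated model.safetensors entry.
POINTER = """\
version https://git-lfs.github.com/spec/v1
oid sha256:9b36c353340ab62bc723e54e2cdad6ddd8807ff03a25921de69160fa82671567
size 1112905680
"""

info = parse_lfs_pointer(POINTER)
print(info["hash_algo"], info["size"])
```

Because the pointer stores only a hash and a size, the unchanged `size 1112905680` with a new `oid` indicates that the weights changed while the file length stayed the same.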