End of training
README.md
CHANGED
@@ -17,7 +17,7 @@ should probably proofread and complete it, then remove this comment. -->
|
|
17 |
|
18 |
This model is a fine-tuned version of [makhataei/qa-persian-mdeberta-v3-base-squad2](https://huggingface.co/makhataei/qa-persian-mdeberta-v3-base-squad2) on the pquad dataset.
|
19 |
It achieves the following results on the evaluation set:
|
20 |
-
- Loss: 2.
|
21 |
|
22 |
## Model description
|
23 |
|
@@ -36,7 +36,7 @@ More information needed
|
|
36 |
### Training hyperparameters
|
37 |
|
38 |
The following hyperparameters were used during training:
|
39 |
-
- learning_rate: 5e-05
|
40 |
- train_batch_size: 5
|
41 |
- eval_batch_size: 5
|
42 |
- seed: 42
|
@@ -48,262 +48,262 @@ The following hyperparameters were used during training:
|
|
48 |
|
49 |
| Training Loss | Epoch | Step | Validation Loss |
|
50 |
|:-------------:|:-----:|:------:|:---------------:|
|
51 |
-
| 0.
|
52 |
-
| 0.
|
53 |
-
| 0.
|
54 |
-
| 0.
|
55 |
-
| 0.
|
56 |
-
| 0.
|
57 |
-
| 0.
|
58 |
-
| 0.
|
59 |
-
| 0.
|
60 |
-
| 0.
|
61 |
-
| 0.
|
62 |
-
| 0.
|
63 |
-
| 0.
|
64 |
-
| 0.
|
65 |
-
| 0.
|
66 |
-
| 0.
|
67 |
-
| 0.
|
68 |
-
| 0.
|
69 |
-
| 0.
|
70 |
-
| 0.
|
71 |
-
| 0.
|
72 |
-
| 0.
|
73 |
-
| 0.
|
74 |
-
| 0.
|
75 |
-
| 0.
|
76 |
-
| 0.
|
77 |
-
| 0.
|
78 |
-
| 0.
|
79 |
-
| 0.
|
80 |
-
| 0.
|
81 |
-
| 0.
|
82 |
-
| 0.
|
83 |
-
| 0.
|
84 |
-
| 0.
|
85 |
-
| 0.
|
86 |
-
| 0.
|
87 |
-
| 0.
|
88 |
-
| 0.
|
89 |
-
| 0.
|
90 |
-
| 0.
|
91 |
-
| 0.
|
92 |
-
| 0.
|
93 |
-
| 0.
|
94 |
-
| 0.
|
95 |
-
| 0.
|
96 |
-
| 0.
|
97 |
-
| 0.
|
98 |
-
| 0.
|
99 |
-
| 0.
|
100 |
-
| 0.
|
101 |
-
| 0.
|
102 |
-
| 0.
|
103 |
-
| 0.
|
104 |
-
| 0.
|
105 |
-
| 0.
|
106 |
-
| 0.
|
107 |
-
| 0.
|
108 |
-
| 0.
|
109 |
-
| 0.
|
110 |
-
| 0.
|
111 |
-
| 0.
|
112 |
-
| 0.
|
113 |
-
| 0.
|
114 |
-
| 0.
|
115 |
-
| 0.
|
116 |
-
| 0.
|
117 |
-
| 0.
|
118 |
-
| 0.
|
119 |
-
| 0.
|
120 |
-
| 0.
|
121 |
-
| 0.
|
122 |
-
| 0.
|
123 |
-
| 0.
|
124 |
-
| 0.
|
125 |
-
| 0.
|
126 |
-
| 0.
|
127 |
-
| 0.
|
128 |
-
| 0.
|
129 |
-
| 0.
|
130 |
-
| 0.
|
131 |
-
| 0.
|
132 |
-
| 0.
|
133 |
-
| 0.
|
134 |
-
| 0.
|
135 |
-
| 0.
|
136 |
-
| 0.
|
137 |
-
| 0.
|
138 |
-
| 0.
|
139 |
-
| 0.
|
140 |
-
| 0.
|
141 |
-
| 0.
|
142 |
-
| 0.
|
143 |
-
| 0.
|
144 |
-
| 0.
|
145 |
-
| 0.
|
146 |
-
| 0.
|
147 |
-
| 0.
|
148 |
-
| 0.
|
149 |
-
| 0.
|
150 |
-
| 0.
|
151 |
-
| 0.
|
152 |
-
| 0.
|
153 |
-
| 0.
|
154 |
-
| 0.
|
155 |
-
| 0.
|
156 |
-
| 0.
|
157 |
-
| 0.
|
158 |
-
| 0.
|
159 |
-
| 0.
|
160 |
-
| 0.
|
161 |
-
| 0.
|
162 |
-
| 0.
|
163 |
-
| 0.
|
164 |
-
| 0.
|
165 |
-
| 0.
|
166 |
-
| 0.
|
167 |
-
| 0.
|
168 |
-
| 0.
|
169 |
-
| 0.
|
170 |
-
| 0.
|
171 |
-
| 0.
|
172 |
-
| 0.
|
173 |
-
| 0.
|
174 |
-
| 0.
|
175 |
-
| 0.
|
176 |
-
| 0.
|
177 |
-
| 0.
|
178 |
-
| 0.
|
179 |
-
| 0.
|
180 |
-
| 0.
|
181 |
-
| 0.
|
182 |
-
| 0.
|
183 |
-
| 0.
|
184 |
-
| 0.
|
185 |
-
| 0.
|
186 |
-
| 0.
|
187 |
-
| 0.
|
188 |
-
| 0.
|
189 |
-
| 0.
|
190 |
-
| 0.
|
191 |
-
| 0.
|
192 |
-
| 0.
|
193 |
-
| 0.
|
194 |
-
| 0.
|
195 |
-
| 0.
|
196 |
-
| 0.
|
197 |
-
| 0.
|
198 |
-
| 0.
|
199 |
-
| 0.
|
200 |
-
| 0.
|
201 |
-
| 0.
|
202 |
-
| 0.
|
203 |
-
| 0.
|
204 |
-
| 0.
|
205 |
-
| 0.
|
206 |
-
| 0.
|
207 |
-
| 0.
|
208 |
-
| 0.
|
209 |
-
| 0.
|
210 |
-
| 0.
|
211 |
-
| 0.
|
212 |
-
| 0.
|
213 |
-
| 0.
|
214 |
-
| 0.
|
215 |
-
| 0.
|
216 |
-
| 0.
|
217 |
-
| 0.
|
218 |
-
| 0.
|
219 |
-
| 0.
|
220 |
-
| 0.
|
221 |
-
| 0.
|
222 |
-
| 0.
|
223 |
-
| 0.
|
224 |
-
| 0.
|
225 |
-
| 0.
|
226 |
-
| 0.
|
227 |
-
| 0.
|
228 |
-
| 0.
|
229 |
-
| 0.
|
230 |
-
| 0.
|
231 |
-
| 0.
|
232 |
-
| 0.
|
233 |
-
| 0.
|
234 |
-
| 0.
|
235 |
-
| 0.
|
236 |
-
| 0.
|
237 |
-
| 0.
|
238 |
-
| 0.
|
239 |
-
| 0.
|
240 |
-
| 0.
|
241 |
-
| 0.
|
242 |
-
| 0.
|
243 |
-
| 0.
|
244 |
-
| 0.
|
245 |
-
| 0.
|
246 |
-
| 0.
|
247 |
-
| 0.
|
248 |
-
| 0.
|
249 |
-
| 0.
|
250 |
-
| 0.
|
251 |
-
| 0.
|
252 |
-
| 0.
|
253 |
-
| 0.
|
254 |
-
| 0.
|
255 |
-
| 0.
|
256 |
-
| 0.
|
257 |
-
| 0.
|
258 |
-
| 0.
|
259 |
-
| 0.
|
260 |
-
| 0.
|
261 |
-
| 0.
|
262 |
-
| 0.
|
263 |
-
| 0.
|
264 |
-
| 0.
|
265 |
-
| 0.
|
266 |
-
| 0.
|
267 |
-
| 0.
|
268 |
-
| 0.
|
269 |
-
| 0.
|
270 |
-
| 0.
|
271 |
-
| 0.
|
272 |
-
| 0.
|
273 |
-
| 0.
|
274 |
-
| 0.
|
275 |
-
| 0.
|
276 |
-
| 0.
|
277 |
-
| 0.
|
278 |
-
| 0.
|
279 |
-
| 0.
|
280 |
-
| 0.
|
281 |
-
| 0.
|
282 |
-
| 0.
|
283 |
-
| 0.
|
284 |
-
| 0.
|
285 |
-
| 0.
|
286 |
-
| 0.
|
287 |
-
| 0.
|
288 |
-
| 0.
|
289 |
-
| 0.
|
290 |
-
| 0.
|
291 |
-
| 0.
|
292 |
-
| 0.
|
293 |
-
| 0.
|
294 |
-
| 0.
|
295 |
-
| 0.
|
296 |
-
| 0.
|
297 |
-
| 0.
|
298 |
-
| 0.
|
299 |
-
| 0.
|
300 |
-
| 0.
|
301 |
-
| 0.
|
302 |
-
| 0.
|
303 |
-
| 0.
|
304 |
-
| 0.
|
305 |
-
| 0.
|
306 |
-
| 0.
|
307 |
|
308 |
|
309 |
### Framework versions
|
|
|
17 |
|
18 |
This model is a fine-tuned version of [makhataei/qa-persian-mdeberta-v3-base-squad2](https://huggingface.co/makhataei/qa-persian-mdeberta-v3-base-squad2) on the pquad dataset.
|
19 |
It achieves the following results on the evaluation set:
|
20 |
+
- Loss: 2.4944
|
21 |
|
22 |
## Model description
|
23 |
|
|
|
36 |
### Training hyperparameters
|
37 |
|
38 |
The following hyperparameters were used during training:
|
39 |
+
- learning_rate: 2.5e-05
|
40 |
- train_batch_size: 5
|
41 |
- eval_batch_size: 5
|
42 |
- seed: 42
|
|
|
48 |
|
49 |
| Training Loss | Epoch | Step | Validation Loss |
|
50 |
|:-------------:|:-----:|:------:|:---------------:|
|
51 |
+
| 0.4273 | 0.04 | 500 | 1.2636 |
|
52 |
+
| 0.3813 | 0.08 | 1000 | 1.2619 |
|
53 |
+
| 0.3933 | 0.12 | 1500 | 1.3319 |
|
54 |
+
| 0.3305 | 0.16 | 2000 | 1.4050 |
|
55 |
+
| 0.3604 | 0.19 | 2500 | 1.4107 |
|
56 |
+
| 0.3431 | 0.23 | 3000 | 1.3068 |
|
57 |
+
| 0.3258 | 0.27 | 3500 | 1.3487 |
|
58 |
+
| 0.3432 | 0.31 | 4000 | 1.4339 |
|
59 |
+
| 0.3429 | 0.35 | 4500 | 1.3738 |
|
60 |
+
| 0.3279 | 0.39 | 5000 | 1.4084 |
|
61 |
+
| 0.3268 | 0.43 | 5500 | 1.3671 |
|
62 |
+
| 0.3352 | 0.47 | 6000 | 1.3462 |
|
63 |
+
| 0.3233 | 0.51 | 6500 | 1.3703 |
|
64 |
+
| 0.3157 | 0.55 | 7000 | 1.4630 |
|
65 |
+
| 0.3005 | 0.58 | 7500 | 1.4817 |
|
66 |
+
| 0.2708 | 0.62 | 8000 | 1.4972 |
|
67 |
+
| 0.3227 | 0.66 | 8500 | 1.4029 |
|
68 |
+
| 0.6272 | 0.7 | 9000 | 1.0431 |
|
69 |
+
| 0.5573 | 0.74 | 9500 | 1.0920 |
|
70 |
+
| 0.5744 | 0.78 | 10000 | 1.0445 |
|
71 |
+
| 0.5128 | 0.82 | 10500 | 1.0858 |
|
72 |
+
| 0.5503 | 0.86 | 11000 | 1.0169 |
|
73 |
+
| 0.5128 | 0.9 | 11500 | 1.0771 |
|
74 |
+
| 0.5281 | 0.94 | 12000 | 1.0501 |
|
75 |
+
| 0.5347 | 0.97 | 12500 | 1.0867 |
|
76 |
+
| 0.4619 | 1.01 | 13000 | 1.2511 |
|
77 |
+
| 0.3691 | 1.05 | 13500 | 1.2085 |
|
78 |
+
| 0.3785 | 1.09 | 14000 | 1.2875 |
|
79 |
+
| 0.3309 | 1.13 | 14500 | 1.3301 |
|
80 |
+
| 0.3818 | 1.17 | 15000 | 1.2383 |
|
81 |
+
| 0.3546 | 1.21 | 15500 | 1.2575 |
|
82 |
+
| 0.3568 | 1.25 | 16000 | 1.3351 |
|
83 |
+
| 0.3475 | 1.29 | 16500 | 1.3030 |
|
84 |
+
| 0.3801 | 1.32 | 17000 | 1.3151 |
|
85 |
+
| 0.383 | 1.36 | 17500 | 1.2519 |
|
86 |
+
| 0.3878 | 1.4 | 18000 | 1.2408 |
|
87 |
+
| 0.3568 | 1.44 | 18500 | 1.2846 |
|
88 |
+
| 0.3901 | 1.48 | 19000 | 1.1482 |
|
89 |
+
| 0.3732 | 1.52 | 19500 | 1.2964 |
|
90 |
+
| 0.3585 | 1.56 | 20000 | 1.2875 |
|
91 |
+
| 0.3854 | 1.6 | 20500 | 1.2647 |
|
92 |
+
| 0.3802 | 1.64 | 21000 | 1.2905 |
|
93 |
+
| 0.3383 | 1.68 | 21500 | 1.3686 |
|
94 |
+
| 0.3809 | 1.71 | 22000 | 1.2277 |
|
95 |
+
| 0.3487 | 1.75 | 22500 | 1.3850 |
|
96 |
+
| 0.3704 | 1.79 | 23000 | 1.2682 |
|
97 |
+
| 0.3868 | 1.83 | 23500 | 1.3091 |
|
98 |
+
| 0.3772 | 1.87 | 24000 | 1.2671 |
|
99 |
+
| 0.3492 | 1.91 | 24500 | 1.3259 |
|
100 |
+
| 0.4124 | 1.95 | 25000 | 1.2334 |
|
101 |
+
| 0.3716 | 1.99 | 25500 | 1.2383 |
|
102 |
+
| 0.3068 | 2.03 | 26000 | 1.4346 |
|
103 |
+
| 0.2693 | 2.06 | 26500 | 1.5702 |
|
104 |
+
| 0.2776 | 2.1 | 27000 | 1.4791 |
|
105 |
+
| 0.2574 | 2.14 | 27500 | 1.5752 |
|
106 |
+
| 0.2764 | 2.18 | 28000 | 1.6362 |
|
107 |
+
| 0.3035 | 2.22 | 28500 | 1.5172 |
|
108 |
+
| 0.2961 | 2.26 | 29000 | 1.4787 |
|
109 |
+
| 0.3115 | 2.3 | 29500 | 1.5763 |
|
110 |
+
| 0.2846 | 2.34 | 30000 | 1.4942 |
|
111 |
+
| 0.2971 | 2.38 | 30500 | 1.4641 |
|
112 |
+
| 0.2448 | 2.42 | 31000 | 1.6608 |
|
113 |
+
| 0.2864 | 2.45 | 31500 | 1.5140 |
|
114 |
+
| 0.3112 | 2.49 | 32000 | 1.5064 |
|
115 |
+
| 0.2768 | 2.53 | 32500 | 1.6051 |
|
116 |
+
| 0.2938 | 2.57 | 33000 | 1.6976 |
|
117 |
+
| 0.2839 | 2.61 | 33500 | 1.4711 |
|
118 |
+
| 0.2675 | 2.65 | 34000 | 1.5766 |
|
119 |
+
| 0.273 | 2.69 | 34500 | 1.5526 |
|
120 |
+
| 0.2446 | 2.73 | 35000 | 1.6282 |
|
121 |
+
| 0.2921 | 2.77 | 35500 | 1.4750 |
|
122 |
+
| 0.2433 | 2.81 | 36000 | 1.5918 |
|
123 |
+
| 0.2634 | 2.84 | 36500 | 1.5804 |
|
124 |
+
| 0.2726 | 2.88 | 37000 | 1.5430 |
|
125 |
+
| 0.2678 | 2.92 | 37500 | 1.5456 |
|
126 |
+
| 0.3963 | 2.96 | 38000 | 1.4429 |
|
127 |
+
| 0.3874 | 3.0 | 38500 | 1.3743 |
|
128 |
+
| 0.2386 | 3.04 | 39000 | 1.6718 |
|
129 |
+
| 0.2666 | 3.08 | 39500 | 1.6247 |
|
130 |
+
| 0.2452 | 3.12 | 40000 | 1.6553 |
|
131 |
+
| 0.2684 | 3.16 | 40500 | 1.5948 |
|
132 |
+
| 0.2741 | 3.19 | 41000 | 1.6774 |
|
133 |
+
| 0.2915 | 3.23 | 41500 | 1.6423 |
|
134 |
+
| 0.289 | 3.27 | 42000 | 1.6159 |
|
135 |
+
| 0.2572 | 3.31 | 42500 | 1.6878 |
|
136 |
+
| 0.2888 | 3.35 | 43000 | 1.6022 |
|
137 |
+
| 0.2787 | 3.39 | 43500 | 1.6714 |
|
138 |
+
| 0.2762 | 3.43 | 44000 | 1.6734 |
|
139 |
+
| 0.304 | 3.47 | 44500 | 1.6225 |
|
140 |
+
| 0.2964 | 3.51 | 45000 | 1.6075 |
|
141 |
+
| 0.3047 | 3.55 | 45500 | 1.6200 |
|
142 |
+
| 0.2929 | 3.58 | 46000 | 1.5646 |
|
143 |
+
| 0.2828 | 3.62 | 46500 | 1.5764 |
|
144 |
+
| 0.2882 | 3.66 | 47000 | 1.6570 |
|
145 |
+
| 0.2756 | 3.7 | 47500 | 1.5030 |
|
146 |
+
| 0.2741 | 3.74 | 48000 | 1.6237 |
|
147 |
+
| 0.2819 | 3.78 | 48500 | 1.5456 |
|
148 |
+
| 0.3243 | 3.82 | 49000 | 1.5030 |
|
149 |
+
| 0.2999 | 3.86 | 49500 | 1.6339 |
|
150 |
+
| 0.2867 | 3.9 | 50000 | 1.6627 |
|
151 |
+
| 0.2834 | 3.94 | 50500 | 1.6580 |
|
152 |
+
| 0.2784 | 3.97 | 51000 | 1.6321 |
|
153 |
+
| 0.2846 | 4.01 | 51500 | 1.5986 |
|
154 |
+
| 0.2059 | 4.05 | 52000 | 1.7993 |
|
155 |
+
| 0.2204 | 4.09 | 52500 | 1.7942 |
|
156 |
+
| 0.2144 | 4.13 | 53000 | 1.7884 |
|
157 |
+
| 0.2385 | 4.17 | 53500 | 1.7064 |
|
158 |
+
| 0.2225 | 4.21 | 54000 | 1.7386 |
|
159 |
+
| 0.2119 | 4.25 | 54500 | 1.9515 |
|
160 |
+
| 0.2033 | 4.29 | 55000 | 1.8603 |
|
161 |
+
| 0.2121 | 4.32 | 55500 | 1.8144 |
|
162 |
+
| 0.2489 | 4.36 | 56000 | 1.7729 |
|
163 |
+
| 0.2284 | 4.4 | 56500 | 1.8237 |
|
164 |
+
| 0.2319 | 4.44 | 57000 | 1.8922 |
|
165 |
+
| 0.2425 | 4.48 | 57500 | 1.7491 |
|
166 |
+
| 0.2535 | 4.52 | 58000 | 1.6738 |
|
167 |
+
| 0.2251 | 4.56 | 58500 | 1.7717 |
|
168 |
+
| 0.2449 | 4.6 | 59000 | 1.7209 |
|
169 |
+
| 0.2472 | 4.64 | 59500 | 1.6438 |
|
170 |
+
| 0.2179 | 4.68 | 60000 | 1.8039 |
|
171 |
+
| 0.2635 | 4.71 | 60500 | 1.6948 |
|
172 |
+
| 0.2301 | 4.75 | 61000 | 1.8228 |
|
173 |
+
| 0.2454 | 4.79 | 61500 | 1.6865 |
|
174 |
+
| 0.2146 | 4.83 | 62000 | 1.8147 |
|
175 |
+
| 0.2639 | 4.87 | 62500 | 1.6340 |
|
176 |
+
| 0.2488 | 4.91 | 63000 | 1.7649 |
|
177 |
+
| 0.2448 | 4.95 | 63500 | 1.7029 |
|
178 |
+
| 0.2373 | 4.99 | 64000 | 1.8508 |
|
179 |
+
| 0.1982 | 5.03 | 64500 | 1.8193 |
|
180 |
+
| 0.1676 | 5.07 | 65000 | 1.9439 |
|
181 |
+
| 0.1397 | 5.1 | 65500 | 2.0506 |
|
182 |
+
| 0.1829 | 5.14 | 66000 | 1.9656 |
|
183 |
+
| 0.1469 | 5.18 | 66500 | 2.0149 |
|
184 |
+
| 0.2015 | 5.22 | 67000 | 1.9251 |
|
185 |
+
| 0.1728 | 5.26 | 67500 | 1.9232 |
|
186 |
+
| 0.214 | 5.3 | 68000 | 1.7829 |
|
187 |
+
| 0.1744 | 5.34 | 68500 | 2.0301 |
|
188 |
+
| 0.1734 | 5.38 | 69000 | 1.9325 |
|
189 |
+
| 0.2109 | 5.42 | 69500 | 1.9063 |
|
190 |
+
| 0.19 | 5.45 | 70000 | 1.9691 |
|
191 |
+
| 0.1947 | 5.49 | 70500 | 1.9812 |
|
192 |
+
| 0.198 | 5.53 | 71000 | 1.9603 |
|
193 |
+
| 0.1889 | 5.57 | 71500 | 1.9647 |
|
194 |
+
| 0.198 | 5.61 | 72000 | 1.8880 |
|
195 |
+
| 0.1741 | 5.65 | 72500 | 2.0263 |
|
196 |
+
| 0.1775 | 5.69 | 73000 | 1.9311 |
|
197 |
+
| 0.1971 | 5.73 | 73500 | 1.9250 |
|
198 |
+
| 0.183 | 5.77 | 74000 | 2.0464 |
|
199 |
+
| 0.1816 | 5.81 | 74500 | 1.9924 |
|
200 |
+
| 0.21 | 5.84 | 75000 | 1.8805 |
|
201 |
+
| 0.1999 | 5.88 | 75500 | 1.8812 |
|
202 |
+
| 0.2089 | 5.92 | 76000 | 1.8398 |
|
203 |
+
| 0.1945 | 5.96 | 76500 | 1.9466 |
|
204 |
+
| 0.1828 | 6.0 | 77000 | 1.9279 |
|
205 |
+
| 0.1423 | 6.04 | 77500 | 2.0748 |
|
206 |
+
| 0.1327 | 6.08 | 78000 | 2.0871 |
|
207 |
+
| 0.1297 | 6.12 | 78500 | 2.1302 |
|
208 |
+
| 0.1313 | 6.16 | 79000 | 2.1704 |
|
209 |
+
| 0.1463 | 6.19 | 79500 | 2.0676 |
|
210 |
+
| 0.1496 | 6.23 | 80000 | 2.0896 |
|
211 |
+
| 0.128 | 6.27 | 80500 | 2.2031 |
|
212 |
+
| 0.1761 | 6.31 | 81000 | 2.0441 |
|
213 |
+
| 0.15 | 6.35 | 81500 | 2.1346 |
|
214 |
+
| 0.1787 | 6.39 | 82000 | 1.9899 |
|
215 |
+
| 0.1407 | 6.43 | 82500 | 2.0616 |
|
216 |
+
| 0.1366 | 6.47 | 83000 | 2.2158 |
|
217 |
+
| 0.149 | 6.51 | 83500 | 2.1434 |
|
218 |
+
| 0.1295 | 6.55 | 84000 | 2.2094 |
|
219 |
+
| 0.1423 | 6.58 | 84500 | 2.1137 |
|
220 |
+
| 0.1595 | 6.62 | 85000 | 2.0735 |
|
221 |
+
| 0.1494 | 6.66 | 85500 | 2.0534 |
|
222 |
+
| 0.1315 | 6.7 | 86000 | 2.1229 |
|
223 |
+
| 0.1778 | 6.74 | 86500 | 2.1022 |
|
224 |
+
| 0.1234 | 6.78 | 87000 | 2.1475 |
|
225 |
+
| 0.1531 | 6.82 | 87500 | 2.0641 |
|
226 |
+
| 0.1537 | 6.86 | 88000 | 2.0913 |
|
227 |
+
| 0.1734 | 6.9 | 88500 | 2.0269 |
|
228 |
+
| 0.1531 | 6.94 | 89000 | 2.0718 |
|
229 |
+
| 0.1731 | 6.97 | 89500 | 2.0188 |
|
230 |
+
| 0.1496 | 7.01 | 90000 | 2.2257 |
|
231 |
+
| 0.1202 | 7.05 | 90500 | 2.1846 |
|
232 |
+
| 0.1125 | 7.09 | 91000 | 2.3543 |
|
233 |
+
| 0.1127 | 7.13 | 91500 | 2.3571 |
|
234 |
+
| 0.1303 | 7.17 | 92000 | 2.2526 |
|
235 |
+
| 0.1151 | 7.21 | 92500 | 2.1961 |
|
236 |
+
| 0.1148 | 7.25 | 93000 | 2.2848 |
|
237 |
+
| 0.1097 | 7.29 | 93500 | 2.3361 |
|
238 |
+
| 0.1132 | 7.32 | 94000 | 2.3850 |
|
239 |
+
| 0.0794 | 7.36 | 94500 | 2.4030 |
|
240 |
+
| 0.1133 | 7.4 | 95000 | 2.2968 |
|
241 |
+
| 0.1174 | 7.44 | 95500 | 2.2693 |
|
242 |
+
| 0.1178 | 7.48 | 96000 | 2.2723 |
|
243 |
+
| 0.0895 | 7.52 | 96500 | 2.3682 |
|
244 |
+
| 0.1269 | 7.56 | 97000 | 2.2746 |
|
245 |
+
| 0.1124 | 7.6 | 97500 | 2.2634 |
|
246 |
+
| 0.1354 | 7.64 | 98000 | 2.2400 |
|
247 |
+
| 0.1329 | 7.68 | 98500 | 2.2261 |
|
248 |
+
| 0.1363 | 7.71 | 99000 | 2.2394 |
|
249 |
+
| 0.1219 | 7.75 | 99500 | 2.2641 |
|
250 |
+
| 0.1067 | 7.79 | 100000 | 2.3639 |
|
251 |
+
| 0.1243 | 7.83 | 100500 | 2.2853 |
|
252 |
+
| 0.1429 | 7.87 | 101000 | 2.2218 |
|
253 |
+
| 0.1282 | 7.91 | 101500 | 2.2358 |
|
254 |
+
| 0.1277 | 7.95 | 102000 | 2.2241 |
|
255 |
+
| 0.143 | 7.99 | 102500 | 2.1506 |
|
256 |
+
| 0.0959 | 8.03 | 103000 | 2.2565 |
|
257 |
+
| 0.0911 | 8.07 | 103500 | 2.3629 |
|
258 |
+
| 0.0923 | 8.1 | 104000 | 2.3459 |
|
259 |
+
| 0.094 | 8.14 | 104500 | 2.3670 |
|
260 |
+
| 0.0983 | 8.18 | 105000 | 2.3862 |
|
261 |
+
| 0.114 | 8.22 | 105500 | 2.3531 |
|
262 |
+
| 0.0783 | 8.26 | 106000 | 2.4318 |
|
263 |
+
| 0.0998 | 8.3 | 106500 | 2.3581 |
|
264 |
+
| 0.0627 | 8.34 | 107000 | 2.5447 |
|
265 |
+
| 0.1007 | 8.38 | 107500 | 2.4340 |
|
266 |
+
| 0.1046 | 8.42 | 108000 | 2.4324 |
|
267 |
+
| 0.0896 | 8.45 | 108500 | 2.3896 |
|
268 |
+
| 0.1194 | 8.49 | 109000 | 2.3735 |
|
269 |
+
| 0.0913 | 8.53 | 109500 | 2.3917 |
|
270 |
+
| 0.1212 | 8.57 | 110000 | 2.3616 |
|
271 |
+
| 0.0998 | 8.61 | 110500 | 2.3847 |
|
272 |
+
| 0.0902 | 8.65 | 111000 | 2.4282 |
|
273 |
+
| 0.0786 | 8.69 | 111500 | 2.4669 |
|
274 |
+
| 0.0944 | 8.73 | 112000 | 2.4121 |
|
275 |
+
| 0.1072 | 8.77 | 112500 | 2.3918 |
|
276 |
+
| 0.1386 | 8.81 | 113000 | 2.3239 |
|
277 |
+
| 0.098 | 8.84 | 113500 | 2.3491 |
|
278 |
+
| 0.0997 | 8.88 | 114000 | 2.3698 |
|
279 |
+
| 0.1054 | 8.92 | 114500 | 2.4200 |
|
280 |
+
| 0.1069 | 8.96 | 115000 | 2.3614 |
|
281 |
+
| 0.1103 | 9.0 | 115500 | 2.3551 |
|
282 |
+
| 0.0943 | 9.04 | 116000 | 2.4380 |
|
283 |
+
| 0.0881 | 9.08 | 116500 | 2.4843 |
|
284 |
+
| 0.0665 | 9.12 | 117000 | 2.5239 |
|
285 |
+
| 0.0789 | 9.16 | 117500 | 2.5221 |
|
286 |
+
| 0.0773 | 9.2 | 118000 | 2.5397 |
|
287 |
+
| 0.0818 | 9.23 | 118500 | 2.4990 |
|
288 |
+
| 0.0684 | 9.27 | 119000 | 2.5446 |
|
289 |
+
| 0.0711 | 9.31 | 119500 | 2.5097 |
|
290 |
+
| 0.0842 | 9.35 | 120000 | 2.5173 |
|
291 |
+
| 0.0819 | 9.39 | 120500 | 2.4953 |
|
292 |
+
| 0.0753 | 9.43 | 121000 | 2.5070 |
|
293 |
+
| 0.09 | 9.47 | 121500 | 2.4626 |
|
294 |
+
| 0.0761 | 9.51 | 122000 | 2.4711 |
|
295 |
+
| 0.074 | 9.55 | 122500 | 2.4678 |
|
296 |
+
| 0.0789 | 9.58 | 123000 | 2.4595 |
|
297 |
+
| 0.0668 | 9.62 | 123500 | 2.4830 |
|
298 |
+
| 0.0912 | 9.66 | 124000 | 2.4984 |
|
299 |
+
| 0.0856 | 9.7 | 124500 | 2.4839 |
|
300 |
+
| 0.0806 | 9.74 | 125000 | 2.4717 |
|
301 |
+
| 0.0842 | 9.78 | 125500 | 2.4759 |
|
302 |
+
| 0.0876 | 9.82 | 126000 | 2.4794 |
|
303 |
+
| 0.0788 | 9.86 | 126500 | 2.4893 |
|
304 |
+
| 0.0671 | 9.9 | 127000 | 2.4955 |
|
305 |
+
| 0.0897 | 9.94 | 127500 | 2.4928 |
|
306 |
+
| 0.0685 | 9.97 | 128000 | 2.4944 |
|
307 |
|
308 |
|
309 |
### Framework versions
|
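In the updated results table, validation loss is lowest early in training (1.0169 at step 11000) and ends at 2.4944 while training loss keeps falling, which may indicate overfitting. A minimal sketch of how one might locate the lowest-validation-loss row in a table of this shape follows. The helper names `parse_rows` and `best_checkpoint` are hypothetical, and only four rows copied from the table are included for brevity.

```python
# Illustrative sketch only: parse_rows and best_checkpoint are hypothetical
# helpers, not part of transformers or any other library.

# A few rows copied verbatim from the training results table in the diff.
ROWS = """
| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:------:|:---------------:|
| 0.4273 | 0.04 | 500 | 1.2636 |
| 0.5503 | 0.86 | 11000 | 1.0169 |
| 0.3716 | 1.99 | 25500 | 1.2383 |
| 0.0685 | 9.97 | 128000 | 2.4944 |
"""


def parse_rows(text):
    """Parse markdown table rows into dicts, skipping header and separator lines."""
    rows = []
    for line in text.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 4:
            continue
        try:
            row = {
                "train_loss": float(cells[0]),
                "epoch": float(cells[1]),
                "step": int(cells[2]),
                "val_loss": float(cells[3]),
            }
        except ValueError:
            continue  # header or alignment row, not numeric data
        rows.append(row)
    return rows


def best_checkpoint(rows):
    """Return the row with the lowest validation loss."""
    return min(rows, key=lambda r: r["val_loss"])


if __name__ == "__main__":
    best = best_checkpoint(parse_rows(ROWS))
    print(best["step"], best["val_loss"])
```

Pasting the full table into `ROWS` would apply the same check to every logged evaluation step.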
model.safetensors
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 1112905680
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:9b36c353340ab62bc723e54e2cdad6ddd8807ff03a25921de69160fa82671567
|
3 |
size 1112905680
|
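The `model.safetensors` entry above is a Git LFS pointer, a three-line text file holding the spec version, a SHA-256 object ID, and the object size in bytes. As a rough sketch, such a pointer could be read like this. The function name `parse_lfs_pointer` is hypothetical, and the pointer text is copied from the new side of the diff.

```python
# Illustrative sketch only: parse_lfs_pointer is a hypothetical helper,
# not part of Git LFS or any library.

# Pointer content copied from the new side of the model.safetensors diff.
POINTER = """version https://git-lfs.github.com/spec/v1
oid sha256:9b36c353340ab62bc723e54e2cdad6ddd8807ff03a25921de69160fa82671567
size 1112905680
"""


def parse_lfs_pointer(text):
    """Split each 'key value' line, then separate the hash algorithm from the digest."""
    fields = dict(line.split(" ", 1) for line in text.strip().splitlines())
    algo, digest = fields["oid"].split(":", 1)
    return {
        "version": fields["version"],
        "algo": algo,
        "digest": digest,
        "size": int(fields["size"]),
    }


if __name__ == "__main__":
    pointer = parse_lfs_pointer(POINTER)
    print(pointer["algo"], pointer["size"])
```

The size is unchanged between the old and new pointers (1112905680 bytes), so only the object ID distinguishes the two versions of the weights.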
runs/Nov28_09-36-28_Software-AI/events.out.tfevents.1701151589.Software-AI.10944.2
ADDED
@@ -0,0 +1,3 @@
|
|
|
|
|
|
|
|
|
1 |
+
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:403f757cef7ee3f9bfa0b9c4ab85bda2e7de9ce7574dc731833e6ac09e690064
|
3 |
+
size 116311
|
training_args.bin
CHANGED
@@ -1,3 +1,3 @@
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
-
oid sha256:
|
3 |
size 4219
|
|
|
1 |
version https://git-lfs.github.com/spec/v1
|
2 |
+
oid sha256:c38cacb29e9571f92b8a98fd4f574e038e109cc118698443e3e39c8ced5d8c86
|
3 |
size 4219
|