---
library_name: transformers
base_model: aubmindlab/bert-base-arabertv02
tags:
- generated_from_trainer
model-index:
- name: ArabicNewSplits8_usingALLEssays_FineTuningAraBERT_run2_AugV5_k12_task2_organization
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# ArabicNewSplits8_usingALLEssays_FineTuningAraBERT_run2_AugV5_k12_task2_organization

This model is a fine-tuned version of [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02) on an unspecified dataset (the Trainer recorded no dataset name).
It achieves the following results on the evaluation set:
- Loss: 0.6468
- Qwk (quadratic weighted kappa): 0.4617
- Mse: 0.6468
- Rmse: 0.8042
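
The metrics above can be reproduced in spirit with a few lines of plain Python. The following is a minimal sketch, not the evaluation code used for this model: the function names are hypothetical, and it assumes the model outputs continuous scores that are rounded to integer labels before QWK is computed (the card does not say how QWK was derived). Because Loss and Mse are identical, the training loss appears to be mean squared error, and Rmse is its square root.

```python
import math


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: the square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def quadratic_weighted_kappa(y_true, y_pred):
    """Quadratic weighted kappa for integer labels.

    Weights use the position of each label in the sorted set of labels
    that actually occur, so labels are treated as evenly spaced ranks.
    """
    labels = sorted(set(y_true) | set(y_pred))
    k = len(labels)
    if k < 2:
        # Degenerate case: only one label occurs. Treating this as perfect
        # agreement is an assumption of this sketch.
        return 1.0
    index = {label: i for i, label in enumerate(labels)}
    n = len(y_true)

    # Confusion matrix of observed (true, predicted) pairs.
    observed = [[0] * k for _ in range(k)]
    for t, p in zip(y_true, y_pred):
        observed[index[t]][index[p]] += 1

    true_hist = [sum(row) for row in observed]
    pred_hist = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(k):
        for j in range(k):
            weight = (i - j) ** 2 / (k - 1) ** 2
            expected = true_hist[i] * pred_hist[j] / n
            weighted_observed += weight * observed[i][j]
            weighted_expected += weight * expected
    return 1.0 - weighted_observed / weighted_expected


# Hypothetical scores, only to show the call pattern.
gold = [0, 0, 1, 1]
raw_predictions = [0.2, 0.8, 1.1, 0.9]
rounded = [round(p) for p in raw_predictions]

print(mse(gold, raw_predictions))
print(rmse(gold, raw_predictions))
print(quadratic_weighted_kappa(gold, rounded))
```

As a consistency check on the reported numbers, `math.sqrt(0.6468)` rounds to 0.8042, which matches the Rmse line.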

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
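
To make the `linear` scheduler setting concrete, here is a small sketch of a linear decay schedule in plain Python. It is an illustration under stated assumptions, not the scheduler code used in training: the function name is hypothetical, no warmup is assumed because none is listed, and the figure of 63 steps per epoch is inferred from the results table (step 126 is logged at epoch 2.0).

```python
def linear_lr(step, total_steps, base_lr=2e-5, warmup_steps=0):
    """Learning rate at `step` for linear warmup followed by linear decay to zero."""
    if warmup_steps and step < warmup_steps:
        # Ramp up from 0 to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Assumption: 63 optimizer steps per epoch, inferred from the results table.
STEPS_PER_EPOCH = 63
NUM_EPOCHS = 100
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS

for step in (0, 518, TOTAL_STEPS // 2, TOTAL_STEPS):
    print(step, linear_lr(step, TOTAL_STEPS))
```

Under these assumptions the learning rate has barely decayed by step 518, the last step logged in the results table, since the schedule is planned over all 100 epochs.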

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Qwk     | Mse    | Rmse   |
|:-------------:|:------:|:----:|:---------------:|:-------:|:------:|:------:|
| No log        | 0.0317 | 2    | 4.1985          | -0.0097 | 4.1985 | 2.0490 |
| No log        | 0.0635 | 4    | 2.3389          | 0.0609  | 2.3389 | 1.5293 |
| No log        | 0.0952 | 6    | 1.3887          | 0.0254  | 1.3887 | 1.1784 |
| No log        | 0.1270 | 8    | 1.0340          | -0.0042 | 1.0340 | 1.0169 |
| No log        | 0.1587 | 10   | 0.8402          | 0.1410  | 0.8402 | 0.9166 |
| No log        | 0.1905 | 12   | 0.8571          | 0.0447  | 0.8571 | 0.9258 |
| No log        | 0.2222 | 14   | 1.1039          | -0.0127 | 1.1039 | 1.0507 |
| No log        | 0.2540 | 16   | 1.0242          | 0.0208  | 1.0242 | 1.0120 |
| No log        | 0.2857 | 18   | 1.0984          | 0.0287  | 1.0984 | 1.0480 |
| No log        | 0.3175 | 20   | 0.9593          | 0.1298  | 0.9593 | 0.9794 |
| No log        | 0.3492 | 22   | 1.0632          | 0.1363  | 1.0632 | 1.0311 |
| No log        | 0.3810 | 24   | 0.8693          | 0.2558  | 0.8693 | 0.9324 |
| No log        | 0.4127 | 26   | 1.1422          | 0.2340  | 1.1422 | 1.0687 |
| No log        | 0.4444 | 28   | 1.2859          | 0.2038  | 1.2859 | 1.1340 |
| No log        | 0.4762 | 30   | 1.0016          | 0.2941  | 1.0016 | 1.0008 |
| No log        | 0.5079 | 32   | 0.6653          | 0.3833  | 0.6653 | 0.8156 |
| No log        | 0.5397 | 34   | 0.6925          | 0.3230  | 0.6925 | 0.8321 |
| No log        | 0.5714 | 36   | 0.7306          | 0.3205  | 0.7306 | 0.8548 |
| No log        | 0.6032 | 38   | 0.8142          | 0.3063  | 0.8142 | 0.9023 |
| No log        | 0.6349 | 40   | 0.8084          | 0.3482  | 0.8084 | 0.8991 |
| No log        | 0.6667 | 42   | 0.8385          | 0.2830  | 0.8385 | 0.9157 |
| No log        | 0.6984 | 44   | 1.1997          | 0.2771  | 1.1997 | 1.0953 |
| No log        | 0.7302 | 46   | 1.6326          | 0.1908  | 1.6326 | 1.2777 |
| No log        | 0.7619 | 48   | 1.6272          | 0.2067  | 1.6272 | 1.2756 |
| No log        | 0.7937 | 50   | 1.1244          | 0.2275  | 1.1244 | 1.0604 |
| No log        | 0.8254 | 52   | 0.8440          | 0.2473  | 0.8440 | 0.9187 |
| No log        | 0.8571 | 54   | 0.7300          | 0.3645  | 0.7300 | 0.8544 |
| No log        | 0.8889 | 56   | 0.7630          | 0.3544  | 0.7630 | 0.8735 |
| No log        | 0.9206 | 58   | 0.8919          | 0.2051  | 0.8919 | 0.9444 |
| No log        | 0.9524 | 60   | 1.0523          | 0.1590  | 1.0523 | 1.0258 |
| No log        | 0.9841 | 62   | 1.0067          | 0.1808  | 1.0067 | 1.0034 |
| No log        | 1.0159 | 64   | 0.9447          | 0.2086  | 0.9447 | 0.9720 |
| No log        | 1.0476 | 66   | 0.8764          | 0.1992  | 0.8764 | 0.9361 |
| No log        | 1.0794 | 68   | 0.8308          | 0.2396  | 0.8308 | 0.9115 |
| No log        | 1.1111 | 70   | 0.9603          | 0.3266  | 0.9603 | 0.9799 |
| No log        | 1.1429 | 72   | 1.2139          | 0.2778  | 1.2139 | 1.1018 |
| No log        | 1.1746 | 74   | 1.0292          | 0.3679  | 1.0292 | 1.0145 |
| No log        | 1.2063 | 76   | 0.7608          | 0.3933  | 0.7608 | 0.8722 |
| No log        | 1.2381 | 78   | 0.6512          | 0.4590  | 0.6512 | 0.8070 |
| No log        | 1.2698 | 80   | 0.6483          | 0.4420  | 0.6483 | 0.8052 |
| No log        | 1.3016 | 82   | 0.6950          | 0.4260  | 0.6950 | 0.8337 |
| No log        | 1.3333 | 84   | 0.7048          | 0.4519  | 0.7048 | 0.8396 |
| No log        | 1.3651 | 86   | 0.7372          | 0.4297  | 0.7372 | 0.8586 |
| No log        | 1.3968 | 88   | 0.7111          | 0.4691  | 0.7111 | 0.8433 |
| No log        | 1.4286 | 90   | 0.6758          | 0.4643  | 0.6758 | 0.8221 |
| No log        | 1.4603 | 92   | 0.6958          | 0.4707  | 0.6958 | 0.8341 |
| No log        | 1.4921 | 94   | 0.7769          | 0.4336  | 0.7769 | 0.8814 |
| No log        | 1.5238 | 96   | 1.0442          | 0.3690  | 1.0442 | 1.0219 |
| No log        | 1.5556 | 98   | 1.0660          | 0.3389  | 1.0660 | 1.0325 |
| No log        | 1.5873 | 100  | 0.7735          | 0.3181  | 0.7735 | 0.8795 |
| No log        | 1.6190 | 102  | 0.6747          | 0.3672  | 0.6747 | 0.8214 |
| No log        | 1.6508 | 104  | 0.6733          | 0.3640  | 0.6733 | 0.8205 |
| No log        | 1.6825 | 106  | 0.7902          | 0.3555  | 0.7902 | 0.8889 |
| No log        | 1.7143 | 108  | 1.0763          | 0.3673  | 1.0763 | 1.0375 |
| No log        | 1.7460 | 110  | 1.1485          | 0.3581  | 1.1485 | 1.0717 |
| No log        | 1.7778 | 112  | 0.9092          | 0.3950  | 0.9092 | 0.9535 |
| No log        | 1.8095 | 114  | 0.7695          | 0.3281  | 0.7695 | 0.8772 |
| No log        | 1.8413 | 116  | 0.7634          | 0.3614  | 0.7634 | 0.8737 |
| No log        | 1.8730 | 118  | 0.8207          | 0.4102  | 0.8207 | 0.9059 |
| No log        | 1.9048 | 120  | 1.0257          | 0.4551  | 1.0257 | 1.0128 |
| No log        | 1.9365 | 122  | 0.9929          | 0.4685  | 0.9929 | 0.9965 |
| No log        | 1.9683 | 124  | 0.7427          | 0.4134  | 0.7427 | 0.8618 |
| No log        | 2.0    | 126  | 0.6762          | 0.4837  | 0.6762 | 0.8223 |
| No log        | 2.0317 | 128  | 0.6480          | 0.4257  | 0.6480 | 0.8050 |
| No log        | 2.0635 | 130  | 0.6444          | 0.3738  | 0.6444 | 0.8028 |
| No log        | 2.0952 | 132  | 0.7345          | 0.4020  | 0.7345 | 0.8570 |
| No log        | 2.1270 | 134  | 0.8672          | 0.4290  | 0.8672 | 0.9312 |
| No log        | 2.1587 | 136  | 0.9554          | 0.4172  | 0.9554 | 0.9774 |
| No log        | 2.1905 | 138  | 0.8992          | 0.4145  | 0.8992 | 0.9483 |
| No log        | 2.2222 | 140  | 0.8570          | 0.4258  | 0.8570 | 0.9258 |
| No log        | 2.2540 | 142  | 0.6909          | 0.4308  | 0.6909 | 0.8312 |
| No log        | 2.2857 | 144  | 0.6653          | 0.4853  | 0.6653 | 0.8156 |
| No log        | 2.3175 | 146  | 0.7009          | 0.5102  | 0.7009 | 0.8372 |
| No log        | 2.3492 | 148  | 0.7075          | 0.5620  | 0.7075 | 0.8412 |
| No log        | 2.3810 | 150  | 0.8354          | 0.4990  | 0.8354 | 0.9140 |
| No log        | 2.4127 | 152  | 0.8771          | 0.4824  | 0.8771 | 0.9365 |
| No log        | 2.4444 | 154  | 1.0004          | 0.3834  | 1.0004 | 1.0002 |
| No log        | 2.4762 | 156  | 0.9419          | 0.3670  | 0.9419 | 0.9705 |
| No log        | 2.5079 | 158  | 0.9602          | 0.3670  | 0.9602 | 0.9799 |
| No log        | 2.5397 | 160  | 0.8200          | 0.4941  | 0.8200 | 0.9056 |
| No log        | 2.5714 | 162  | 0.8522          | 0.4330  | 0.8522 | 0.9231 |
| No log        | 2.6032 | 164  | 0.7161          | 0.5053  | 0.7161 | 0.8462 |
| No log        | 2.6349 | 166  | 0.7169          | 0.4996  | 0.7169 | 0.8467 |
| No log        | 2.6667 | 168  | 0.8542          | 0.4521  | 0.8542 | 0.9242 |
| No log        | 2.6984 | 170  | 1.0408          | 0.3563  | 1.0408 | 1.0202 |
| No log        | 2.7302 | 172  | 0.8155          | 0.4280  | 0.8155 | 0.9031 |
| No log        | 2.7619 | 174  | 0.6945          | 0.4898  | 0.6945 | 0.8334 |
| No log        | 2.7937 | 176  | 0.6694          | 0.4345  | 0.6694 | 0.8182 |
| No log        | 2.8254 | 178  | 0.6659          | 0.4241  | 0.6659 | 0.8160 |
| No log        | 2.8571 | 180  | 0.7606          | 0.3964  | 0.7606 | 0.8721 |
| No log        | 2.8889 | 182  | 0.7990          | 0.3795  | 0.7990 | 0.8939 |
| No log        | 2.9206 | 184  | 0.7722          | 0.4240  | 0.7722 | 0.8787 |
| No log        | 2.9524 | 186  | 0.7273          | 0.4113  | 0.7273 | 0.8528 |
| No log        | 2.9841 | 188  | 0.7928          | 0.4033  | 0.7928 | 0.8904 |
| No log        | 3.0159 | 190  | 0.7647          | 0.4425  | 0.7647 | 0.8745 |
| No log        | 3.0476 | 192  | 0.8654          | 0.3742  | 0.8654 | 0.9303 |
| No log        | 3.0794 | 194  | 1.1018          | 0.3700  | 1.1018 | 1.0497 |
| No log        | 3.1111 | 196  | 0.9714          | 0.3796  | 0.9714 | 0.9856 |
| No log        | 3.1429 | 198  | 0.7754          | 0.4422  | 0.7754 | 0.8805 |
| No log        | 3.1746 | 200  | 0.6841          | 0.4497  | 0.6841 | 0.8271 |
| No log        | 3.2063 | 202  | 0.6570          | 0.4049  | 0.6570 | 0.8105 |
| No log        | 3.2381 | 204  | 0.6857          | 0.4398  | 0.6857 | 0.8281 |
| No log        | 3.2698 | 206  | 0.7655          | 0.4252  | 0.7655 | 0.8749 |
| No log        | 3.3016 | 208  | 0.7268          | 0.4410  | 0.7268 | 0.8525 |
| No log        | 3.3333 | 210  | 0.6673          | 0.4874  | 0.6673 | 0.8169 |
| No log        | 3.3651 | 212  | 0.7260          | 0.4553  | 0.7260 | 0.8520 |
| No log        | 3.3968 | 214  | 0.8206          | 0.4075  | 0.8206 | 0.9059 |
| No log        | 3.4286 | 216  | 0.7519          | 0.4461  | 0.7519 | 0.8671 |
| No log        | 3.4603 | 218  | 0.7193          | 0.5170  | 0.7194 | 0.8481 |
| No log        | 3.4921 | 220  | 0.7384          | 0.4849  | 0.7384 | 0.8593 |
| No log        | 3.5238 | 222  | 0.7119          | 0.5197  | 0.7119 | 0.8437 |
| No log        | 3.5556 | 224  | 0.7124          | 0.4656  | 0.7124 | 0.8440 |
| No log        | 3.5873 | 226  | 0.7763          | 0.4065  | 0.7763 | 0.8811 |
| No log        | 3.6190 | 228  | 0.7239          | 0.4628  | 0.7239 | 0.8508 |
| No log        | 3.6508 | 230  | 0.7355          | 0.4361  | 0.7355 | 0.8576 |
| No log        | 3.6825 | 232  | 0.7074          | 0.4097  | 0.7074 | 0.8410 |
| No log        | 3.7143 | 234  | 0.7182          | 0.4153  | 0.7182 | 0.8475 |
| No log        | 3.7460 | 236  | 0.7361          | 0.4051  | 0.7361 | 0.8579 |
| No log        | 3.7778 | 238  | 0.7425          | 0.4419  | 0.7425 | 0.8617 |
| No log        | 3.8095 | 240  | 0.7510          | 0.4477  | 0.7510 | 0.8666 |
| No log        | 3.8413 | 242  | 0.7481          | 0.4813  | 0.7481 | 0.8649 |
| No log        | 3.8730 | 244  | 0.7766          | 0.4279  | 0.7766 | 0.8812 |
| No log        | 3.9048 | 246  | 0.7553          | 0.4472  | 0.7553 | 0.8691 |
| No log        | 3.9365 | 248  | 0.7446          | 0.4592  | 0.7446 | 0.8629 |
| No log        | 3.9683 | 250  | 0.7951          | 0.4476  | 0.7951 | 0.8917 |
| No log        | 4.0    | 252  | 0.7595          | 0.4588  | 0.7595 | 0.8715 |
| No log        | 4.0317 | 254  | 0.6920          | 0.4310  | 0.6920 | 0.8319 |
| No log        | 4.0635 | 256  | 0.6952          | 0.3562  | 0.6952 | 0.8338 |
| No log        | 4.0952 | 258  | 0.6660          | 0.2709  | 0.6660 | 0.8161 |
| No log        | 4.1270 | 260  | 0.6759          | 0.3952  | 0.6759 | 0.8221 |
| No log        | 4.1587 | 262  | 0.7101          | 0.3425  | 0.7101 | 0.8427 |
| No log        | 4.1905 | 264  | 0.8133          | 0.3926  | 0.8133 | 0.9018 |
| No log        | 4.2222 | 266  | 0.7580          | 0.3699  | 0.7580 | 0.8707 |
| No log        | 4.2540 | 268  | 0.7293          | 0.4621  | 0.7293 | 0.8540 |
| No log        | 4.2857 | 270  | 0.7459          | 0.4545  | 0.7459 | 0.8636 |
| No log        | 4.3175 | 272  | 0.7806          | 0.4927  | 0.7806 | 0.8835 |
| No log        | 4.3492 | 274  | 0.9133          | 0.3753  | 0.9133 | 0.9557 |
| No log        | 4.3810 | 276  | 1.0909          | 0.3795  | 1.0909 | 1.0445 |
| No log        | 4.4127 | 278  | 0.9129          | 0.3876  | 0.9129 | 0.9555 |
| No log        | 4.4444 | 280  | 0.6822          | 0.3725  | 0.6822 | 0.8260 |
| No log        | 4.4762 | 282  | 0.7058          | 0.4425  | 0.7058 | 0.8401 |
| No log        | 4.5079 | 284  | 0.6988          | 0.4201  | 0.6988 | 0.8360 |
| No log        | 4.5397 | 286  | 0.7506          | 0.3943  | 0.7506 | 0.8664 |
| No log        | 4.5714 | 288  | 0.8990          | 0.3628  | 0.8990 | 0.9482 |
| No log        | 4.6032 | 290  | 0.8977          | 0.3761  | 0.8977 | 0.9475 |
| No log        | 4.6349 | 292  | 0.7466          | 0.3496  | 0.7466 | 0.8641 |
| No log        | 4.6667 | 294  | 0.7251          | 0.4297  | 0.7251 | 0.8515 |
| No log        | 4.6984 | 296  | 0.7432          | 0.4143  | 0.7432 | 0.8621 |
| No log        | 4.7302 | 298  | 0.6917          | 0.4593  | 0.6917 | 0.8317 |
| No log        | 4.7619 | 300  | 0.7187          | 0.3942  | 0.7187 | 0.8477 |
| No log        | 4.7937 | 302  | 0.9081          | 0.3797  | 0.9081 | 0.9530 |
| No log        | 4.8254 | 304  | 0.9147          | 0.3876  | 0.9147 | 0.9564 |
| No log        | 4.8571 | 306  | 0.7353          | 0.4329  | 0.7353 | 0.8575 |
| No log        | 4.8889 | 308  | 0.6900          | 0.4625  | 0.6900 | 0.8306 |
| No log        | 4.9206 | 310  | 0.7806          | 0.4528  | 0.7806 | 0.8835 |
| No log        | 4.9524 | 312  | 0.7511          | 0.4651  | 0.7511 | 0.8666 |
| No log        | 4.9841 | 314  | 0.6900          | 0.4703  | 0.6900 | 0.8307 |
| No log        | 5.0159 | 316  | 0.6448          | 0.5051  | 0.6448 | 0.8030 |
| No log        | 5.0476 | 318  | 0.7429          | 0.4589  | 0.7429 | 0.8619 |
| No log        | 5.0794 | 320  | 0.9207          | 0.4307  | 0.9207 | 0.9595 |
| No log        | 5.1111 | 322  | 0.8850          | 0.4209  | 0.8850 | 0.9407 |
| No log        | 5.1429 | 324  | 0.7212          | 0.4714  | 0.7212 | 0.8492 |
| No log        | 5.1746 | 326  | 0.6456          | 0.4650  | 0.6456 | 0.8035 |
| No log        | 5.2063 | 328  | 0.6902          | 0.5162  | 0.6902 | 0.8308 |
| No log        | 5.2381 | 330  | 0.6686          | 0.4582  | 0.6686 | 0.8177 |
| No log        | 5.2698 | 332  | 0.6229          | 0.3456  | 0.6229 | 0.7892 |
| No log        | 5.3016 | 334  | 0.6483          | 0.3310  | 0.6483 | 0.8052 |
| No log        | 5.3333 | 336  | 0.7364          | 0.3822  | 0.7364 | 0.8581 |
| No log        | 5.3651 | 338  | 0.7452          | 0.3977  | 0.7452 | 0.8633 |
| No log        | 5.3968 | 340  | 0.6764          | 0.4152  | 0.6764 | 0.8225 |
| No log        | 5.4286 | 342  | 0.6695          | 0.4104  | 0.6695 | 0.8182 |
| No log        | 5.4603 | 344  | 0.6998          | 0.4603  | 0.6998 | 0.8365 |
| No log        | 5.4921 | 346  | 0.7254          | 0.4840  | 0.7254 | 0.8517 |
| No log        | 5.5238 | 348  | 0.7975          | 0.3978  | 0.7975 | 0.8930 |
| No log        | 5.5556 | 350  | 1.0017          | 0.4235  | 1.0017 | 1.0009 |
| No log        | 5.5873 | 352  | 0.9517          | 0.4174  | 0.9517 | 0.9755 |
| No log        | 5.6190 | 354  | 0.8220          | 0.3854  | 0.8220 | 0.9067 |
| No log        | 5.6508 | 356  | 0.7428          | 0.4111  | 0.7428 | 0.8618 |
| No log        | 5.6825 | 358  | 0.7048          | 0.5193  | 0.7048 | 0.8396 |
| No log        | 5.7143 | 360  | 0.7686          | 0.4296  | 0.7686 | 0.8767 |
| No log        | 5.7460 | 362  | 0.7461          | 0.4348  | 0.7461 | 0.8638 |
| No log        | 5.7778 | 364  | 0.6796          | 0.3505  | 0.6796 | 0.8244 |
| No log        | 5.8095 | 366  | 0.6641          | 0.4583  | 0.6641 | 0.8149 |
| No log        | 5.8413 | 368  | 0.7783          | 0.3964  | 0.7783 | 0.8822 |
| No log        | 5.8730 | 370  | 0.8155          | 0.3828  | 0.8155 | 0.9030 |
| No log        | 5.9048 | 372  | 0.7156          | 0.4961  | 0.7156 | 0.8459 |
| No log        | 5.9365 | 374  | 0.6841          | 0.4549  | 0.6841 | 0.8271 |
| No log        | 5.9683 | 376  | 0.7285          | 0.4466  | 0.7285 | 0.8535 |
| No log        | 6.0    | 378  | 0.7280          | 0.5318  | 0.7280 | 0.8532 |
| No log        | 6.0317 | 380  | 0.7640          | 0.4743  | 0.7640 | 0.8741 |
| No log        | 6.0635 | 382  | 0.9156          | 0.3898  | 0.9156 | 0.9569 |
| No log        | 6.0952 | 384  | 0.9746          | 0.3883  | 0.9746 | 0.9872 |
| No log        | 6.1270 | 386  | 0.8185          | 0.4267  | 0.8185 | 0.9047 |
| No log        | 6.1587 | 388  | 0.6631          | 0.3985  | 0.6631 | 0.8143 |
| No log        | 6.1905 | 390  | 0.6711          | 0.4539  | 0.6711 | 0.8192 |
| No log        | 6.2222 | 392  | 0.6495          | 0.4038  | 0.6495 | 0.8059 |
| No log        | 6.2540 | 394  | 0.6589          | 0.4357  | 0.6589 | 0.8117 |
| No log        | 6.2857 | 396  | 0.6898          | 0.4198  | 0.6898 | 0.8305 |
| No log        | 6.3175 | 398  | 0.6933          | 0.4257  | 0.6933 | 0.8326 |
| No log        | 6.3492 | 400  | 0.6969          | 0.4285  | 0.6969 | 0.8348 |
| No log        | 6.3810 | 402  | 0.6821          | 0.4481  | 0.6821 | 0.8259 |
| No log        | 6.4127 | 404  | 0.7897          | 0.4080  | 0.7897 | 0.8886 |
| No log        | 6.4444 | 406  | 0.8443          | 0.4138  | 0.8443 | 0.9188 |
| No log        | 6.4762 | 408  | 0.7324          | 0.4531  | 0.7324 | 0.8558 |
| No log        | 6.5079 | 410  | 0.6474          | 0.4296  | 0.6474 | 0.8046 |
| No log        | 6.5397 | 412  | 0.6369          | 0.4666  | 0.6369 | 0.7981 |
| No log        | 6.5714 | 414  | 0.6731          | 0.4711  | 0.6731 | 0.8204 |
| No log        | 6.6032 | 416  | 0.6760          | 0.4693  | 0.6760 | 0.8222 |
| No log        | 6.6349 | 418  | 0.6324          | 0.4742  | 0.6324 | 0.7953 |
| No log        | 6.6667 | 420  | 0.6169          | 0.4049  | 0.6169 | 0.7854 |
| No log        | 6.6984 | 422  | 0.6636          | 0.4032  | 0.6636 | 0.8146 |
| No log        | 6.7302 | 424  | 0.6790          | 0.3615  | 0.6790 | 0.8240 |
| No log        | 6.7619 | 426  | 0.6525          | 0.3764  | 0.6525 | 0.8078 |
| No log        | 6.7937 | 428  | 0.6404          | 0.4799  | 0.6404 | 0.8003 |
| No log        | 6.8254 | 430  | 0.6305          | 0.4262  | 0.6305 | 0.7941 |
| No log        | 6.8571 | 432  | 0.6465          | 0.4476  | 0.6465 | 0.8040 |
| No log        | 6.8889 | 434  | 0.6543          | 0.4721  | 0.6543 | 0.8089 |
| No log        | 6.9206 | 436  | 0.6854          | 0.4517  | 0.6854 | 0.8279 |
| No log        | 6.9524 | 438  | 0.7681          | 0.4469  | 0.7681 | 0.8764 |
| No log        | 6.9841 | 440  | 0.7654          | 0.4334  | 0.7654 | 0.8749 |
| No log        | 7.0159 | 442  | 0.6939          | 0.4734  | 0.6939 | 0.8330 |
| No log        | 7.0476 | 444  | 0.6613          | 0.6189  | 0.6613 | 0.8132 |
| No log        | 7.0794 | 446  | 0.6609          | 0.5481  | 0.6609 | 0.8129 |
| No log        | 7.1111 | 448  | 0.6274          | 0.5103  | 0.6274 | 0.7921 |
| No log        | 7.1429 | 450  | 0.6202          | 0.4770  | 0.6202 | 0.7875 |
| No log        | 7.1746 | 452  | 0.6173          | 0.4998  | 0.6173 | 0.7857 |
| No log        | 7.2063 | 454  | 0.6335          | 0.5161  | 0.6335 | 0.7960 |
| No log        | 7.2381 | 456  | 0.6657          | 0.5353  | 0.6657 | 0.8159 |
| No log        | 7.2698 | 458  | 0.6977          | 0.4681  | 0.6977 | 0.8353 |
| No log        | 7.3016 | 460  | 0.7087          | 0.4777  | 0.7087 | 0.8419 |
| No log        | 7.3333 | 462  | 0.6742          | 0.5609  | 0.6742 | 0.8211 |
| No log        | 7.3651 | 464  | 0.7652          | 0.4042  | 0.7652 | 0.8747 |
| No log        | 7.3968 | 466  | 0.9109          | 0.4307  | 0.9109 | 0.9544 |
| No log        | 7.4286 | 468  | 0.8905          | 0.4235  | 0.8905 | 0.9437 |
| No log        | 7.4603 | 470  | 0.6986          | 0.3803  | 0.6986 | 0.8358 |
| No log        | 7.4921 | 472  | 0.6264          | 0.4514  | 0.6264 | 0.7914 |
| No log        | 7.5238 | 474  | 0.7257          | 0.4218  | 0.7257 | 0.8519 |
| No log        | 7.5556 | 476  | 0.7820          | 0.3452  | 0.7820 | 0.8843 |
| No log        | 7.5873 | 478  | 0.7264          | 0.4166  | 0.7264 | 0.8523 |
| No log        | 7.6190 | 480  | 0.6675          | 0.4435  | 0.6675 | 0.8170 |
| No log        | 7.6508 | 482  | 0.6706          | 0.3904  | 0.6706 | 0.8189 |
| No log        | 7.6825 | 484  | 0.7181          | 0.4608  | 0.7181 | 0.8474 |
| No log        | 7.7143 | 486  | 0.7387          | 0.4539  | 0.7387 | 0.8595 |
| No log        | 7.7460 | 488  | 0.6951          | 0.4284  | 0.6951 | 0.8337 |
| No log        | 7.7778 | 490  | 0.7037          | 0.4408  | 0.7037 | 0.8389 |
| No log        | 7.8095 | 492  | 0.6978          | 0.4506  | 0.6978 | 0.8353 |
| No log        | 7.8413 | 494  | 0.6904          | 0.4809  | 0.6904 | 0.8309 |
| No log        | 7.8730 | 496  | 0.6728          | 0.4846  | 0.6728 | 0.8202 |
| No log        | 7.9048 | 498  | 0.6560          | 0.4642  | 0.6560 | 0.8099 |
| 0.3683        | 7.9365 | 500  | 0.6380          | 0.4690  | 0.6380 | 0.7988 |
| 0.3683        | 7.9683 | 502  | 0.6305          | 0.4690  | 0.6305 | 0.7940 |
| 0.3683        | 8.0    | 504  | 0.6386          | 0.5175  | 0.6386 | 0.7991 |
| 0.3683        | 8.0317 | 506  | 0.6675          | 0.4887  | 0.6675 | 0.8170 |
| 0.3683        | 8.0635 | 508  | 0.6431          | 0.5688  | 0.6431 | 0.8019 |
| 0.3683        | 8.0952 | 510  | 0.6678          | 0.4229  | 0.6678 | 0.8172 |
| 0.3683        | 8.1270 | 512  | 0.7706          | 0.3691  | 0.7706 | 0.8778 |
| 0.3683        | 8.1587 | 514  | 0.8158          | 0.3735  | 0.8158 | 0.9032 |
| 0.3683        | 8.1905 | 516  | 0.7496          | 0.3961  | 0.7496 | 0.8658 |
| 0.3683        | 8.2222 | 518  | 0.6468          | 0.4617  | 0.6468 | 0.8042 |
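
The last row of the table (step 518) matches the evaluation results reported at the top of this card, but it is neither the row with the highest Qwk nor the one with the lowest validation loss. The sketch below shows one way to compare logged evaluations. It uses only a handful of rows copied from the table, and the `EvalRow` type and variable names are hypothetical.

```python
from typing import NamedTuple


class EvalRow(NamedTuple):
    epoch: float
    step: int
    val_loss: float
    qwk: float


# A subset of rows copied from the training results table (not the full log).
ROWS = [
    EvalRow(2.3492, 148, 0.7075, 0.5620),
    EvalRow(6.6667, 420, 0.6169, 0.4049),
    EvalRow(7.0476, 444, 0.6613, 0.6189),
    EvalRow(8.0635, 508, 0.6431, 0.5688),
    EvalRow(8.2222, 518, 0.6468, 0.4617),  # final logged row
]

best_qwk = max(ROWS, key=lambda row: row.qwk)
lowest_loss = min(ROWS, key=lambda row: row.val_loss)
final = ROWS[-1]

print("highest Qwk in subset:", best_qwk)
print("lowest validation loss in subset:", lowest_loss)
print("final logged row:", final)
```

Which criterion to prefer depends on the downstream use: Qwk measures agreement on ordinal labels, while validation loss here is the mean squared error of the raw scores.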


### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.0+cu118
- Datasets 2.21.0
- Tokenizers 0.19.1