---
library_name: transformers
base_model: aubmindlab/bert-base-arabertv02
tags:
- generated_from_trainer
model-index:
- name: ArabicNewSplits7_OSS_usingWellWrittenEssays_FineTuningAraBERT_run1_AugV5_k11_task1_organization
  results: []
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# ArabicNewSplits7_OSS_usingWellWrittenEssays_FineTuningAraBERT_run1_AugV5_k11_task1_organization

This model is a fine-tuned version of [aubmindlab/bert-base-arabertv02](https://huggingface.co/aubmindlab/bert-base-arabertv02) on an unspecified dataset (the Trainer recorded the dataset name as `None`).
It achieves the following results on the evaluation set:
- Loss: 0.8464
- Qwk: 0.6755
- Mse: 0.8464
- Rmse: 0.9200
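
The reported numbers are internally consistent: Loss equals Mse, which suggests (the card does not state it) a mean-squared-error objective, and Rmse is the square root of Mse. The following is a minimal sketch of how such metrics are commonly computed, assuming that Qwk stands for quadratic weighted kappa and that scores are integer ratings. The function names `mse` and `quadratic_weighted_kappa` are illustrative and are not taken from the training code.

```python
import math
from collections import Counter


def mse(y_true, y_pred):
    """Mean squared error between two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def quadratic_weighted_kappa(y_true, y_pred, min_rating, max_rating):
    """Quadratic weighted kappa for integer ratings in [min_rating, max_rating].

    Assumes at least two rating levels and that the labels are not all
    identical in both sequences (otherwise the denominator is zero).
    """
    n_levels = max_rating - min_rating + 1
    total = len(y_true)
    observed = Counter(zip(y_true, y_pred))  # joint counts of (true, predicted)
    hist_true = Counter(y_true)
    hist_pred = Counter(y_pred)

    numerator = 0.0
    denominator = 0.0
    for i in range(min_rating, max_rating + 1):
        for j in range(min_rating, max_rating + 1):
            # Disagreement penalty grows with the squared rating distance.
            weight = (i - j) ** 2 / (n_levels - 1) ** 2
            expected = hist_true[i] * hist_pred[j] / total
            numerator += weight * observed[(i, j)]
            denominator += weight * expected
    return 1.0 - numerator / denominator


# The card's Mse of 0.8464 gives the reported Rmse of 0.9200.
rmse_from_card = round(math.sqrt(0.8464), 4)
```

Because the kappa is defined on discrete ratings, continuous regression outputs would need to be rounded to the rating scale first; whether the original evaluation did so is not documented here.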

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 2e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- num_epochs: 100
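
To make the `linear` scheduler setting concrete, here is a minimal sketch of a linear decay schedule. It assumes no warmup (none is listed above) and a schedule planned over 100 epochs of 82 optimizer steps each, since the results table reaches epoch 1.0 at step 82. The function `linear_lr` and its defaults are illustrative, not the Trainer's own code.

```python
def linear_lr(step, base_lr=2e-05, total_steps=82 * 100, warmup_steps=0):
    """Learning rate at an optimizer step: linear warmup, then linear decay to zero.

    The defaults are assumptions drawn from this card: base_lr from the
    hyperparameter list, 82 steps per epoch from the results table, and
    zero warmup because no warmup setting is reported.
    """
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


# Rate at the last evaluated step in the results table (step 510).
lr_at_last_logged_step = linear_lr(510)
```

Under these assumptions the table's last row (step 510, about epoch 6.22) falls early in the schedule, so the learning rate there would still be close to its initial value.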

### Training results

| Training Loss | Epoch  | Step | Validation Loss | Qwk     | Mse    | Rmse   |
|:-------------:|:------:|:----:|:---------------:|:-------:|:------:|:------:|
| No log        | 0.0244 | 2    | 7.6165          | -0.0256 | 7.6165 | 2.7598 |
| No log        | 0.0488 | 4    | 4.9018          | 0.0519  | 4.9018 | 2.2140 |
| No log        | 0.0732 | 6    | 3.3316          | 0.0860  | 3.3316 | 1.8253 |
| No log        | 0.0976 | 8    | 2.0935          | 0.2462  | 2.0935 | 1.4469 |
| No log        | 0.1220 | 10   | 1.8309          | 0.1495  | 1.8309 | 1.3531 |
| No log        | 0.1463 | 12   | 1.6642          | 0.1538  | 1.6642 | 1.2900 |
| No log        | 0.1707 | 14   | 1.6474          | 0.1165  | 1.6474 | 1.2835 |
| No log        | 0.1951 | 16   | 1.5831          | 0.0784  | 1.5831 | 1.2582 |
| No log        | 0.2195 | 18   | 1.4829          | 0.2075  | 1.4829 | 1.2177 |
| No log        | 0.2439 | 20   | 1.4671          | 0.1905  | 1.4671 | 1.2112 |
| No log        | 0.2683 | 22   | 1.4567          | 0.2364  | 1.4567 | 1.2069 |
| No log        | 0.2927 | 24   | 1.3252          | 0.2857  | 1.3252 | 1.1512 |
| No log        | 0.3171 | 26   | 1.1934          | 0.2857  | 1.1934 | 1.0924 |
| No log        | 0.3415 | 28   | 1.1724          | 0.4211  | 1.1724 | 1.0828 |
| No log        | 0.3659 | 30   | 1.1860          | 0.4348  | 1.1860 | 1.0890 |
| No log        | 0.3902 | 32   | 1.0245          | 0.5512  | 1.0245 | 1.0122 |
| No log        | 0.4146 | 34   | 1.0117          | 0.5440  | 1.0117 | 1.0058 |
| No log        | 0.4390 | 36   | 1.2305          | 0.5000  | 1.2305 | 1.1093 |
| No log        | 0.4634 | 38   | 1.7426          | 0.2381  | 1.7426 | 1.3201 |
| No log        | 0.4878 | 40   | 1.9963          | 0.2794  | 1.9963 | 1.4129 |
| No log        | 0.5122 | 42   | 1.9076          | 0.3571  | 1.9076 | 1.3811 |
| No log        | 0.5366 | 44   | 1.8613          | 0.3803  | 1.8613 | 1.3643 |
| No log        | 0.5610 | 46   | 1.3423          | 0.4706  | 1.3423 | 1.1586 |
| No log        | 0.5854 | 48   | 1.0859          | 0.5000  | 1.0859 | 1.0421 |
| No log        | 0.6098 | 50   | 0.9909          | 0.6047  | 0.9909 | 0.9954 |
| No log        | 0.6341 | 52   | 1.0012          | 0.6154  | 1.0012 | 1.0006 |
| No log        | 0.6585 | 54   | 1.0941          | 0.5865  | 1.0941 | 1.0460 |
| No log        | 0.6829 | 56   | 1.1556          | 0.5857  | 1.1556 | 1.0750 |
| No log        | 0.7073 | 58   | 1.0671          | 0.5692  | 1.0671 | 1.0330 |
| No log        | 0.7317 | 60   | 0.9920          | 0.6154  | 0.9920 | 0.9960 |
| No log        | 0.7561 | 62   | 1.0068          | 0.6466  | 1.0068 | 1.0034 |
| No log        | 0.7805 | 64   | 1.1685          | 0.5735  | 1.1685 | 1.0810 |
| No log        | 0.8049 | 66   | 1.1926          | 0.5827  | 1.1926 | 1.0921 |
| No log        | 0.8293 | 68   | 1.1983          | 0.5344  | 1.1983 | 1.0947 |
| No log        | 0.8537 | 70   | 1.1960          | 0.5147  | 1.1960 | 1.0936 |
| No log        | 0.8780 | 72   | 1.2415          | 0.5571  | 1.2415 | 1.1142 |
| No log        | 0.9024 | 74   | 1.1956          | 0.5652  | 1.1956 | 1.0934 |
| No log        | 0.9268 | 76   | 1.1764          | 0.5547  | 1.1764 | 1.0846 |
| No log        | 0.9512 | 78   | 1.2871          | 0.5072  | 1.2871 | 1.1345 |
| No log        | 0.9756 | 80   | 1.3635          | 0.4853  | 1.3635 | 1.1677 |
| No log        | 1.0000 | 82   | 1.2743          | 0.5455  | 1.2743 | 1.1288 |
| No log        | 1.0244 | 84   | 1.1181          | 0.5649  | 1.1181 | 1.0574 |
| No log        | 1.0488 | 86   | 1.0023          | 0.5865  | 1.0023 | 1.0011 |
| No log        | 1.0732 | 88   | 0.9497          | 0.6471  | 0.9497 | 0.9745 |
| No log        | 1.0976 | 90   | 0.9949          | 0.6074  | 0.9949 | 0.9974 |
| No log        | 1.1220 | 92   | 1.0148          | 0.6061  | 1.0148 | 1.0074 |
| No log        | 1.1463 | 94   | 1.1205          | 0.5839  | 1.1205 | 1.0585 |
| No log        | 1.1707 | 96   | 1.0986          | 0.5778  | 1.0986 | 1.0481 |
| No log        | 1.1951 | 98   | 1.0460          | 0.5649  | 1.0460 | 1.0227 |
| No log        | 1.2195 | 100  | 1.0453          | 0.6119  | 1.0453 | 1.0224 |
| No log        | 1.2439 | 102  | 1.1340          | 0.5931  | 1.1340 | 1.0649 |
| No log        | 1.2683 | 104  | 1.1541          | 0.5957  | 1.1541 | 1.0743 |
| No log        | 1.2927 | 106  | 1.0553          | 0.6111  | 1.0553 | 1.0273 |
| No log        | 1.3171 | 108  | 1.0141          | 0.6081  | 1.0141 | 1.0070 |
| No log        | 1.3415 | 110  | 0.9003          | 0.6892  | 0.9003 | 0.9489 |
| No log        | 1.3659 | 112  | 0.9160          | 0.6792  | 0.9160 | 0.9571 |
| No log        | 1.3902 | 114  | 0.8782          | 0.6795  | 0.8782 | 0.9371 |
| No log        | 1.4146 | 116  | 0.8832          | 0.6573  | 0.8832 | 0.9398 |
| No log        | 1.4390 | 118  | 1.1524          | 0.6375  | 1.1524 | 1.0735 |
| No log        | 1.4634 | 120  | 1.3371          | 0.6250  | 1.3371 | 1.1563 |
| No log        | 1.4878 | 122  | 1.2287          | 0.6506  | 1.2287 | 1.1085 |
| No log        | 1.5122 | 124  | 0.9339          | 0.6364  | 0.9339 | 0.9664 |
| No log        | 1.5366 | 126  | 0.7977          | 0.6515  | 0.7977 | 0.8931 |
| No log        | 1.5610 | 128  | 0.8250          | 0.6617  | 0.8250 | 0.9083 |
| No log        | 1.5854 | 130  | 0.8447          | 0.6519  | 0.8447 | 0.9191 |
| No log        | 1.6098 | 132  | 0.8952          | 0.6715  | 0.8952 | 0.9461 |
| No log        | 1.6341 | 134  | 0.9761          | 0.6043  | 0.9761 | 0.9880 |
| No log        | 1.6585 | 136  | 1.1902          | 0.5839  | 1.1902 | 1.0910 |
| No log        | 1.6829 | 138  | 1.2350          | 0.5714  | 1.2350 | 1.1113 |
| No log        | 1.7073 | 140  | 1.2820          | 0.5795  | 1.2820 | 1.1323 |
| No log        | 1.7317 | 142  | 1.3277          | 0.5650  | 1.3277 | 1.1523 |
| No log        | 1.7561 | 144  | 1.2380          | 0.6024  | 1.2380 | 1.1127 |
| No log        | 1.7805 | 146  | 1.0942          | 0.6099  | 1.0942 | 1.0460 |
| No log        | 1.8049 | 148  | 0.8698          | 0.6107  | 0.8698 | 0.9326 |
| No log        | 1.8293 | 150  | 0.8496          | 0.6202  | 0.8496 | 0.9218 |
| No log        | 1.8537 | 152  | 0.8618          | 0.6250  | 0.8618 | 0.9283 |
| No log        | 1.8780 | 154  | 0.9836          | 0.6767  | 0.9836 | 0.9917 |
| No log        | 1.9024 | 156  | 1.1339          | 0.5594  | 1.1339 | 1.0648 |
| No log        | 1.9268 | 158  | 1.1433          | 0.6627  | 1.1433 | 1.0693 |
| No log        | 1.9512 | 160  | 0.9125          | 0.7066  | 0.9125 | 0.9553 |
| No log        | 1.9756 | 162  | 0.9019          | 0.6933  | 0.9019 | 0.9497 |
| No log        | 2.0000 | 164  | 0.8926          | 0.6933  | 0.8926 | 0.9448 |
| No log        | 2.0244 | 166  | 0.9515          | 0.6842  | 0.9515 | 0.9755 |
| No log        | 2.0488 | 168  | 1.0819          | 0.6369  | 1.0819 | 1.0402 |
| No log        | 2.0732 | 170  | 1.0885          | 0.5890  | 1.0885 | 1.0433 |
| No log        | 2.0976 | 172  | 0.8934          | 0.6809  | 0.8934 | 0.9452 |
| No log        | 2.1220 | 174  | 0.8617          | 0.7050  | 0.8617 | 0.9283 |
| No log        | 2.1463 | 176  | 0.8625          | 0.6765  | 0.8625 | 0.9287 |
| No log        | 2.1707 | 178  | 0.9199          | 0.6316  | 0.9199 | 0.9591 |
| No log        | 2.1951 | 180  | 0.9402          | 0.5865  | 0.9402 | 0.9696 |
| No log        | 2.2195 | 182  | 0.9922          | 0.6250  | 0.9922 | 0.9961 |
| No log        | 2.2439 | 184  | 1.2184          | 0.6386  | 1.2184 | 1.1038 |
| No log        | 2.2683 | 186  | 1.3164          | 0.6592  | 1.3164 | 1.1473 |
| No log        | 2.2927 | 188  | 1.0609          | 0.6667  | 1.0609 | 1.0300 |
| No log        | 2.3171 | 190  | 0.7818          | 0.6301  | 0.7818 | 0.8842 |
| No log        | 2.3415 | 192  | 0.7664          | 0.6528  | 0.7664 | 0.8755 |
| No log        | 2.3659 | 194  | 0.7689          | 0.6423  | 0.7689 | 0.8769 |
| No log        | 2.3902 | 196  | 1.0288          | 0.6538  | 1.0288 | 1.0143 |
| No log        | 2.4146 | 198  | 1.2157          | 0.5926  | 1.2157 | 1.1026 |
| No log        | 2.4390 | 200  | 1.0564          | 0.6345  | 1.0564 | 1.0278 |
| No log        | 2.4634 | 202  | 0.9115          | 0.6131  | 0.9115 | 0.9547 |
| No log        | 2.4878 | 204  | 0.8351          | 0.6618  | 0.8351 | 0.9138 |
| No log        | 2.5122 | 206  | 0.8553          | 0.6618  | 0.8553 | 0.9248 |
| No log        | 2.5366 | 208  | 0.8209          | 0.6569  | 0.8209 | 0.9060 |
| No log        | 2.5610 | 210  | 0.7635          | 0.6714  | 0.7635 | 0.8738 |
| No log        | 2.5854 | 212  | 0.8193          | 0.6761  | 0.8193 | 0.9051 |
| No log        | 2.6098 | 214  | 1.0372          | 0.6486  | 1.0372 | 1.0184 |
| No log        | 2.6341 | 216  | 1.1746          | 0.6145  | 1.1746 | 1.0838 |
| No log        | 2.6585 | 218  | 1.0083          | 0.6711  | 1.0083 | 1.0041 |
| No log        | 2.6829 | 220  | 0.7721          | 0.6569  | 0.7721 | 0.8787 |
| No log        | 2.7073 | 222  | 0.7570          | 0.6866  | 0.7570 | 0.8701 |
| No log        | 2.7317 | 224  | 0.7902          | 0.6715  | 0.7902 | 0.8889 |
| No log        | 2.7561 | 226  | 0.8144          | 0.6423  | 0.8144 | 0.9024 |
| No log        | 2.7805 | 228  | 0.7433          | 0.6715  | 0.7433 | 0.8622 |
| No log        | 2.8049 | 230  | 0.7336          | 0.6957  | 0.7336 | 0.8565 |
| No log        | 2.8293 | 232  | 0.7767          | 0.6429  | 0.7767 | 0.8813 |
| No log        | 2.8537 | 234  | 0.8307          | 0.6429  | 0.8307 | 0.9114 |
| No log        | 2.8780 | 236  | 0.7697          | 0.6857  | 0.7697 | 0.8773 |
| No log        | 2.9024 | 238  | 0.6946          | 0.6950  | 0.6946 | 0.8334 |
| No log        | 2.9268 | 240  | 0.6784          | 0.6944  | 0.6784 | 0.8237 |
| No log        | 2.9512 | 242  | 0.7771          | 0.7081  | 0.7771 | 0.8815 |
| No log        | 2.9756 | 244  | 0.9807          | 0.6946  | 0.9807 | 0.9903 |
| No log        | 3.0000 | 246  | 0.9268          | 0.6443  | 0.9268 | 0.9627 |
| No log        | 3.0244 | 248  | 0.8223          | 0.6763  | 0.8223 | 0.9068 |
| No log        | 3.0488 | 250  | 0.7872          | 0.7153  | 0.7872 | 0.8873 |
| No log        | 3.0732 | 252  | 0.8334          | 0.6761  | 0.8334 | 0.9129 |
| No log        | 3.0976 | 254  | 1.0119          | 0.6541  | 1.0119 | 1.0059 |
| No log        | 3.1220 | 256  | 1.1239          | 0.6429  | 1.1239 | 1.0601 |
| No log        | 3.1463 | 258  | 1.0480          | 0.6824  | 1.0480 | 1.0237 |
| No log        | 3.1707 | 260  | 0.9033          | 0.6790  | 0.9033 | 0.9504 |
| No log        | 3.1951 | 262  | 0.8150          | 0.6429  | 0.8150 | 0.9028 |
| No log        | 3.2195 | 264  | 0.8084          | 0.6522  | 0.8084 | 0.8991 |
| No log        | 3.2439 | 266  | 0.8299          | 0.6377  | 0.8299 | 0.9110 |
| No log        | 3.2683 | 268  | 0.8824          | 0.6519  | 0.8824 | 0.9394 |
| No log        | 3.2927 | 270  | 0.8832          | 0.6528  | 0.8832 | 0.9398 |
| No log        | 3.3171 | 272  | 0.8433          | 0.6389  | 0.8433 | 0.9183 |
| No log        | 3.3415 | 274  | 0.8276          | 0.6301  | 0.8276 | 0.9097 |
| No log        | 3.3659 | 276  | 0.7838          | 0.6119  | 0.7838 | 0.8853 |
| No log        | 3.3902 | 278  | 0.7763          | 0.6912  | 0.7763 | 0.8811 |
| No log        | 3.4146 | 280  | 0.9018          | 0.6914  | 0.9018 | 0.9496 |
| No log        | 3.4390 | 282  | 1.0291          | 0.6786  | 1.0291 | 1.0144 |
| No log        | 3.4634 | 284  | 1.0116          | 0.6786  | 1.0116 | 1.0058 |
| No log        | 3.4878 | 286  | 0.8722          | 0.6918  | 0.8722 | 0.9339 |
| No log        | 3.5122 | 288  | 0.7671          | 0.7347  | 0.7671 | 0.8759 |
| No log        | 3.5366 | 290  | 0.7363          | 0.7034  | 0.7363 | 0.8581 |
| No log        | 3.5610 | 292  | 0.7529          | 0.6846  | 0.7529 | 0.8677 |
| No log        | 3.5854 | 294  | 0.7677          | 0.6950  | 0.7677 | 0.8762 |
| No log        | 3.6098 | 296  | 0.7775          | 0.6713  | 0.7775 | 0.8818 |
| No log        | 3.6341 | 298  | 0.8949          | 0.6667  | 0.8949 | 0.9460 |
| No log        | 3.6585 | 300  | 1.1345          | 0.6538  | 1.1345 | 1.0651 |
| No log        | 3.6829 | 302  | 1.1516          | 0.5793  | 1.1516 | 1.0731 |
| No log        | 3.7073 | 304  | 0.9584          | 0.6269  | 0.9584 | 0.9790 |
| No log        | 3.7317 | 306  | 0.8727          | 0.6357  | 0.8727 | 0.9342 |
| No log        | 3.7561 | 308  | 0.8492          | 0.5760  | 0.8492 | 0.9215 |
| No log        | 3.7805 | 310  | 0.8615          | 0.6202  | 0.8615 | 0.9282 |
| No log        | 3.8049 | 312  | 1.0254          | 0.6400  | 1.0254 | 1.0126 |
| No log        | 3.8293 | 314  | 1.3184          | 0.5943  | 1.3184 | 1.1482 |
| No log        | 3.8537 | 316  | 1.3347          | 0.5909  | 1.3347 | 1.1553 |
| No log        | 3.8780 | 318  | 1.2526          | 0.6279  | 1.2526 | 1.1192 |
| No log        | 3.9024 | 320  | 1.0347          | 0.6289  | 1.0347 | 1.0172 |
| No log        | 3.9268 | 322  | 0.8660          | 0.6154  | 0.8660 | 0.9306 |
| No log        | 3.9512 | 324  | 0.8792          | 0.6154  | 0.8792 | 0.9377 |
| No log        | 3.9756 | 326  | 0.9155          | 0.6154  | 0.9155 | 0.9568 |
| No log        | 4.0000 | 328  | 0.9295          | 0.5954  | 0.9295 | 0.9641 |
| No log        | 4.0244 | 330  | 0.8827          | 0.6061  | 0.8827 | 0.9395 |
| No log        | 4.0488 | 332  | 0.8483          | 0.6324  | 0.8483 | 0.9210 |
| No log        | 4.0732 | 334  | 0.9358          | 0.6267  | 0.9358 | 0.9674 |
| No log        | 4.0976 | 336  | 0.8839          | 0.6624  | 0.8839 | 0.9401 |
| No log        | 4.1220 | 338  | 0.8292          | 0.6792  | 0.8292 | 0.9106 |
| No log        | 4.1463 | 340  | 0.7474          | 0.7114  | 0.7474 | 0.8646 |
| No log        | 4.1707 | 342  | 0.7325          | 0.7338  | 0.7325 | 0.8558 |
| No log        | 4.1951 | 344  | 0.7360          | 0.7101  | 0.7360 | 0.8579 |
| No log        | 4.2195 | 346  | 0.7580          | 0.6861  | 0.7580 | 0.8707 |
| No log        | 4.2439 | 348  | 0.8582          | 0.6575  | 0.8582 | 0.9264 |
| No log        | 4.2683 | 350  | 0.9010          | 0.6490  | 0.9010 | 0.9492 |
| No log        | 4.2927 | 352  | 0.8493          | 0.6331  | 0.8493 | 0.9216 |
| No log        | 4.3171 | 354  | 0.8848          | 0.6316  | 0.8848 | 0.9406 |
| No log        | 4.3415 | 356  | 0.9025          | 0.6074  | 0.9025 | 0.9500 |
| No log        | 4.3659 | 358  | 0.9107          | 0.6176  | 0.9107 | 0.9543 |
| No log        | 4.3902 | 360  | 0.9632          | 0.6099  | 0.9632 | 0.9814 |
| No log        | 4.4146 | 362  | 0.9728          | 0.5931  | 0.9728 | 0.9863 |
| No log        | 4.4390 | 364  | 0.9274          | 0.6400  | 0.9274 | 0.9630 |
| No log        | 4.4634 | 366  | 0.8939          | 0.6528  | 0.8939 | 0.9455 |
| No log        | 4.4878 | 368  | 0.8161          | 0.6522  | 0.8161 | 0.9034 |
| No log        | 4.5122 | 370  | 0.8077          | 0.6364  | 0.8077 | 0.8987 |
| No log        | 4.5366 | 372  | 0.8089          | 0.6308  | 0.8089 | 0.8994 |
| No log        | 4.5610 | 374  | 0.8337          | 0.6716  | 0.8337 | 0.9130 |
| No log        | 4.5854 | 376  | 1.0006          | 0.6309  | 1.0006 | 1.0003 |
| No log        | 4.6098 | 378  | 1.1286          | 0.6289  | 1.1286 | 1.0624 |
| No log        | 4.6341 | 380  | 1.1534          | 0.6548  | 1.1534 | 1.0740 |
| No log        | 4.6585 | 382  | 0.9705          | 0.6452  | 0.9705 | 0.9852 |
| No log        | 4.6829 | 384  | 0.7981          | 0.6519  | 0.7981 | 0.8934 |
| No log        | 4.7073 | 386  | 0.7707          | 0.6917  | 0.7707 | 0.8779 |
| No log        | 4.7317 | 388  | 0.7582          | 0.6917  | 0.7582 | 0.8708 |
| No log        | 4.7561 | 390  | 0.7701          | 0.6906  | 0.7701 | 0.8776 |
| No log        | 4.7805 | 392  | 0.9190          | 0.6875  | 0.9190 | 0.9587 |
| No log        | 4.8049 | 394  | 1.0723          | 0.6471  | 1.0723 | 1.0355 |
| No log        | 4.8293 | 396  | 1.0019          | 0.7093  | 1.0019 | 1.0010 |
| No log        | 4.8537 | 398  | 0.8132          | 0.7186  | 0.8132 | 0.9018 |
| No log        | 4.8780 | 400  | 0.7199          | 0.6849  | 0.7199 | 0.8485 |
| No log        | 4.9024 | 402  | 0.7359          | 0.7059  | 0.7359 | 0.8579 |
| No log        | 4.9268 | 404  | 0.7706          | 0.6818  | 0.7706 | 0.8778 |
| No log        | 4.9512 | 406  | 0.7969          | 0.6718  | 0.7969 | 0.8927 |
| No log        | 4.9756 | 408  | 0.9296          | 0.6525  | 0.9296 | 0.9642 |
| No log        | 5.0000 | 410  | 1.1808          | 0.5802  | 1.1808 | 1.0866 |
| No log        | 5.0244 | 412  | 1.3288          | 0.5488  | 1.3288 | 1.1527 |
| No log        | 5.0488 | 414  | 1.2418          | 0.5034  | 1.2418 | 1.1144 |
| No log        | 5.0732 | 416  | 1.1527          | 0.5954  | 1.1527 | 1.0736 |
| No log        | 5.0976 | 418  | 1.0989          | 0.6412  | 1.0989 | 1.0483 |
| No log        | 5.1220 | 420  | 1.1049          | 0.6364  | 1.1049 | 1.0511 |
| No log        | 5.1463 | 422  | 1.1341          | 0.5833  | 1.1341 | 1.0650 |
| No log        | 5.1707 | 424  | 1.1316          | 0.6259  | 1.1316 | 1.0638 |
| No log        | 5.1951 | 426  | 1.1481          | 0.6242  | 1.1481 | 1.0715 |
| No log        | 5.2195 | 428  | 1.0562          | 0.6405  | 1.0562 | 1.0277 |
| No log        | 5.2439 | 430  | 0.8647          | 0.6383  | 0.8647 | 0.9299 |
| No log        | 5.2683 | 432  | 0.7759          | 0.7101  | 0.7759 | 0.8808 |
| No log        | 5.2927 | 434  | 0.7609          | 0.6861  | 0.7609 | 0.8723 |
| No log        | 5.3171 | 436  | 0.7920          | 0.6853  | 0.7920 | 0.8899 |
| No log        | 5.3415 | 438  | 0.7818          | 0.7000  | 0.7818 | 0.8842 |
| No log        | 5.3659 | 440  | 0.7644          | 0.7194  | 0.7644 | 0.8743 |
| No log        | 5.3902 | 442  | 0.7782          | 0.7234  | 0.7782 | 0.8821 |
| No log        | 5.4146 | 444  | 0.7950          | 0.7143  | 0.7950 | 0.8916 |
| No log        | 5.4390 | 446  | 0.8467          | 0.7143  | 0.8467 | 0.9201 |
| No log        | 5.4634 | 448  | 0.8668          | 0.6765  | 0.8668 | 0.9310 |
| No log        | 5.4878 | 450  | 0.8624          | 0.6667  | 0.8624 | 0.9286 |
| No log        | 5.5122 | 452  | 0.8755          | 0.6667  | 0.8755 | 0.9357 |
| No log        | 5.5366 | 454  | 0.9514          | 0.6395  | 0.9514 | 0.9754 |
| No log        | 5.5610 | 456  | 0.9966          | 0.6093  | 0.9966 | 0.9983 |
| No log        | 5.5854 | 458  | 1.0195          | 0.6335  | 1.0195 | 1.0097 |
| No log        | 5.6098 | 460  | 0.9353          | 0.6667  | 0.9353 | 0.9671 |
| No log        | 5.6341 | 462  | 0.7721          | 0.7183  | 0.7721 | 0.8787 |
| No log        | 5.6585 | 464  | 0.7060          | 0.7429  | 0.7060 | 0.8402 |
| No log        | 5.6829 | 466  | 0.6737          | 0.7299  | 0.6737 | 0.8208 |
| No log        | 5.7073 | 468  | 0.6647          | 0.7429  | 0.6647 | 0.8153 |
| No log        | 5.7317 | 470  | 0.7131          | 0.7310  | 0.7131 | 0.8445 |
| No log        | 5.7561 | 472  | 0.7636          | 0.7531  | 0.7636 | 0.8739 |
| No log        | 5.7805 | 474  | 0.7277          | 0.7355  | 0.7277 | 0.8531 |
| No log        | 5.8049 | 476  | 0.6934          | 0.7273  | 0.6934 | 0.8327 |
| No log        | 5.8293 | 478  | 0.7459          | 0.7183  | 0.7459 | 0.8637 |
| No log        | 5.8537 | 480  | 0.8362          | 0.6906  | 0.8362 | 0.9144 |
| No log        | 5.8780 | 482  | 0.8880          | 0.6906  | 0.8880 | 0.9423 |
| No log        | 5.9024 | 484  | 0.8536          | 0.6906  | 0.8536 | 0.9239 |
| No log        | 5.9268 | 486  | 0.8010          | 0.7000  | 0.8010 | 0.8950 |
| No log        | 5.9512 | 488  | 0.8100          | 0.6901  | 0.8100 | 0.9000 |
| No log        | 5.9756 | 490  | 0.8426          | 0.6986  | 0.8426 | 0.9179 |
| No log        | 6.0000 | 492  | 0.8930          | 0.6846  | 0.8930 | 0.9450 |
| No log        | 6.0244 | 494  | 0.8467          | 0.7000  | 0.8467 | 0.9202 |
| No log        | 6.0488 | 496  | 0.8101          | 0.7000  | 0.8101 | 0.9000 |
| No log        | 6.0732 | 498  | 0.7571          | 0.7101  | 0.7571 | 0.8701 |
| 0.4196        | 6.0976 | 500  | 0.7238          | 0.7518  | 0.7238 | 0.8507 |
| 0.4196        | 6.1220 | 502  | 0.7168          | 0.7015  | 0.7168 | 0.8466 |
| 0.4196        | 6.1463 | 504  | 0.7218          | 0.6667  | 0.7218 | 0.8496 |
| 0.4196        | 6.1707 | 506  | 0.7034          | 0.6912  | 0.7034 | 0.8387 |
| 0.4196        | 6.1951 | 508  | 0.7578          | 0.6857  | 0.7578 | 0.8705 |
| 0.4196        | 6.2195 | 510  | 0.8464          | 0.6755  | 0.8464 | 0.9200 |
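
The headline metrics come from the last row (step 510), yet some earlier evaluations score better on individual metrics. The sketch below shows one way to parse rows of this table and compare checkpoints; it uses four rows copied from the table above, and the helper `parse_rows` is a hypothetical name introduced for illustration.

```python
SAMPLE_ROWS = """
| Training Loss | Epoch  | Step | Validation Loss | Qwk     | Mse    | Rmse   |
|:-------------:|:------:|:----:|:---------------:|:-------:|:------:|:------:|
| No log        | 5.7073 | 468  | 0.6647          | 0.7429  | 0.6647 | 0.8153 |
| No log        | 5.7561 | 472  | 0.7636          | 0.7531  | 0.7636 | 0.8739 |
| 0.4196        | 6.0976 | 500  | 0.7238          | 0.7518  | 0.7238 | 0.8507 |
| 0.4196        | 6.2195 | 510  | 0.8464          | 0.6755  | 0.8464 | 0.9200 |
"""


def parse_rows(markdown):
    """Parse data rows of the results table into dicts, skipping header lines."""
    rows = []
    for line in markdown.strip().splitlines():
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if len(cells) != 7:
            continue
        try:
            epoch = float(cells[1])
            step = int(cells[2])
            val_loss, qwk, mse, rmse = (float(cell) for cell in cells[3:7])
        except ValueError:
            continue  # header or alignment row
        # "No log" means no training loss had been logged yet at that step.
        train_loss = None if cells[0] == "No log" else float(cells[0])
        rows.append({
            "train_loss": train_loss, "epoch": epoch, "step": step,
            "val_loss": val_loss, "qwk": qwk, "mse": mse, "rmse": rmse,
        })
    return rows


rows = parse_rows(SAMPLE_ROWS)
best_by_qwk = max(rows, key=lambda row: row["qwk"])
lowest_loss = min(rows, key=lambda row: row["val_loss"])
print(best_by_qwk["step"], lowest_loss["step"])
```

Among these four rows, step 472 has the highest Qwk and step 468 the lowest validation loss. Which checkpoint the published weights correspond to is not stated in this card.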


### Framework versions

- Transformers 4.44.2
- Pytorch 2.4.0+cu118
- Datasets 2.21.0
- Tokenizers 0.19.1
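
When reproducing the run, it can help to compare the local environment against the versions listed above. This is a minimal sketch using only the standard library; it assumes the PyPI distribution names `transformers`, `torch`, `datasets`, and `tokenizers`, and the helper `compare_versions` is a hypothetical name.

```python
from importlib import metadata

# Versions as listed in this card; "torch" is the PyPI name for PyTorch.
EXPECTED = {
    "transformers": "4.44.2",
    "torch": "2.4.0+cu118",
    "datasets": "2.21.0",
    "tokenizers": "0.19.1",
}


def compare_versions(expected=EXPECTED):
    """Return {package: (expected, installed_or_None, matches)} for each package."""
    report = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        report[package] = (wanted, installed, installed == wanted)
    return report


for package, (wanted, installed, matches) in compare_versions().items():
    status = "ok" if matches else "differs"
    print(f"{package}: expected {wanted}, found {installed} ({status})")
```

A mismatch does not necessarily prevent loading the model; the check only flags differences from the training environment.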