calculator_model_test

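This model is a fine-tuned version of a base model that the card does not name, trained on a dataset recorded only as None. It achieves the following results on the evaluation set:

Loss: 0.0064 (validation loss after the final epoch, step 240)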

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999), epsilon=1e-08, and no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
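
To make the linear schedule concrete, here is a minimal sketch of how the learning rate would decay over this run. It assumes no warmup, since the card lists no warmup setting, and takes 6 optimizer steps per epoch from the results table. The names `linear_lr`, `WARMUP_STEPS`, and `TOTAL_STEPS` are illustrative and do not come from the training script.

```python
# Values reported in the card.
LEARNING_RATE = 1e-3
NUM_EPOCHS = 40
STEPS_PER_EPOCH = 6  # the results table shows step 6 at epoch 1.0
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 240

# Assumption: no warmup, because the card does not list one.
WARMUP_STEPS = 0


def linear_lr(step: int) -> float:
    """Learning rate after `step` optimizer updates with linear warmup and decay."""
    if step < WARMUP_STEPS:
        # Linear ramp from 0 up to the base learning rate.
        return LEARNING_RATE * step / max(1, WARMUP_STEPS)
    # Linear decay from the base learning rate down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return LEARNING_RATE * remaining / max(1, TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 60, 120, 180, 240):
        print(f"step {step:3d}: lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate is half its initial value at step 120 (epoch 20) and reaches zero at step 240, which is consistent with the flattening of both losses over the last few epochs.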

Training Loss	Epoch	Step	Validation Loss
2.7931	1.0	6	2.1113
1.8077	2.0	12	1.4058
1.1644	3.0	18	0.9666
0.8817	4.0	24	0.7780
0.7403	5.0	30	0.7428
0.7086	6.0	36	0.6427
0.6289	7.0	42	0.5918
0.5774	8.0	48	0.5336
0.5304	9.0	54	0.4841
0.4572	10.0	60	0.4076
0.4071	11.0	66	0.3833
0.3504	12.0	72	0.3248
0.3101	13.0	78	0.2868
0.2649	14.0	84	0.2062
0.2075	15.0	90	0.1612
0.1667	16.0	96	0.1267
0.1327	17.0	102	0.0917
0.1027	18.0	108	0.0700
0.0871	19.0	114	0.0644
0.0785	20.0	120	0.0544
0.0770	21.0	126	0.0449
0.0588	22.0	132	0.0333
0.0459	23.0	138	0.0280
0.0418	24.0	144	0.0217
0.0347	25.0	150	0.0180
0.0325	26.0	156	0.0179
0.0293	27.0	162	0.0148
0.0271	28.0	168	0.0117
0.0223	29.0	174	0.0129
0.0196	30.0	180	0.0099
0.0164	31.0	186	0.0087
0.0154	32.0	192	0.0082
0.0130	33.0	198	0.0075
0.0123	34.0	204	0.0070
0.0123	35.0	210	0.0068
0.0112	36.0	216	0.0066
0.0125	37.0	222	0.0065
0.0111	38.0	228	0.0065
0.0122	39.0	234	0.0065
0.0107	40.0	240	0.0064
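
A quick way to read the table above is to look for the best epoch and for any epoch where validation loss got worse. The sketch below does that on the validation-loss column, copied by hand from the table. The helper names `best_epoch` and `regressions` are hypothetical and are not part of any library.

```python
# Validation loss for epochs 1..40, copied from the results table.
VAL_LOSS = [
    2.1113, 1.4058, 0.9666, 0.7780, 0.7428, 0.6427, 0.5918, 0.5336, 0.4841, 0.4076,
    0.3833, 0.3248, 0.2868, 0.2062, 0.1612, 0.1267, 0.0917, 0.0700, 0.0644, 0.0544,
    0.0449, 0.0333, 0.0280, 0.0217, 0.0180, 0.0179, 0.0148, 0.0117, 0.0129, 0.0099,
    0.0087, 0.0082, 0.0075, 0.0070, 0.0068, 0.0066, 0.0065, 0.0065, 0.0065, 0.0064,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) of the lowest loss; ties go to the earliest epoch."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


def regressions(losses):
    """Return the 1-based epochs whose loss is strictly higher than the previous epoch's."""
    return [i + 1 for i in range(1, len(losses)) if losses[i] > losses[i - 1]]


if __name__ == "__main__":
    print(best_epoch(VAL_LOSS))   # (40, 0.0064)
    print(regressions(VAL_LOSS))  # [29]
```

The only increase is at epoch 29 (0.0117 to 0.0129). From epoch 37 onward the validation loss changes by at most 0.0001 per epoch.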