calculator_model_test

This model is a fine-tuned version of an unspecified base model; the card lists the training dataset as None. On the evaluation set it reaches a final validation loss of 0.5857 (epoch 40, step 240).

Model description

More information needed

The following hyperparameters were used during training:

learning_rate: 0.001
train_batch_size: 512
eval_batch_size: 512
seed: 42
optimizer: AdamW (OptimizerNames.ADAMW_TORCH) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
lr_scheduler_type: linear
num_epochs: 40
mixed_precision_training: Native AMP
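
To make the linear scheduler concrete, here is a minimal sketch of the learning rate it would produce over this run. It assumes zero warmup steps (none are listed above) and 6 optimizer steps per epoch, as reported in the training results. The function name `linear_lr` and the constants are illustrative, not part of any library.

```python
# Hyperparameters copied from the list above.
BASE_LR = 1e-3
NUM_EPOCHS = 40

# From the training results: step 6 is reached at epoch 1.0.
STEPS_PER_EPOCH = 6
TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH  # 240

# Assumption: no warmup, because no warmup setting is listed.
WARMUP_STEPS = 0


def linear_lr(step, base_lr=BASE_LR, total_steps=TOTAL_STEPS,
              warmup_steps=WARMUP_STEPS):
    """Learning rate at an optimizer step: linear warmup, then linear decay to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 10, 20, 30, 40):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d} (step {step:3d}): lr = {linear_lr(step):.6f}")
```

Under these assumptions the rate starts at 0.001, falls to 0.0005 halfway through (step 120), and reaches 0 at step 240.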

Training Loss	Epoch	Step	Validation Loss
3.6713	1.0	6	3.1066
2.6657	2.0	12	2.1494
1.9445	3.0	18	1.7102
1.6766	4.0	24	1.5843
1.5675	5.0	30	1.5400
1.514	6.0	36	1.4790
1.4359	7.0	42	1.3848
1.3846	8.0	48	1.3430
1.3429	9.0	54	1.3113
1.2695	10.0	60	1.2591
1.2346	11.0	66	1.1202
1.1799	12.0	72	1.1988
1.1394	13.0	78	1.0635
1.0823	14.0	84	1.0404
1.0411	15.0	90	1.0058
1.002	16.0	96	0.9632
0.9569	17.0	102	0.9427
0.946	18.0	108	0.8890
0.9083	19.0	114	0.8726
0.8994	20.0	120	0.8342
0.8852	21.0	126	0.8450
0.8647	22.0	132	0.7971
0.8379	23.0	138	0.7805
0.8139	24.0	144	0.7417
0.7895	25.0	150	0.7641
0.8015	26.0	156	0.7093
0.7586	27.0	162	0.7121
0.7653	28.0	168	0.7131
0.7452	29.0	174	0.6806
0.7385	30.0	180	0.6572
0.7004	31.0	186	0.6542
0.6924	32.0	192	0.6507
0.6934	33.0	198	0.6411
0.6803	34.0	204	0.6227
0.6655	35.0	210	0.6102
0.657	36.0	216	0.6125
0.6514	37.0	222	0.5946
0.6528	38.0	228	0.5912
0.6493	39.0	234	0.5893
0.6398	40.0	240	0.5857
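
The table can be summarised with a short script. The sketch below copies the validation losses from the table and reports the best epoch and the epochs where validation loss rose relative to the previous one. It also bounds the training-set size implied by 6 steps per epoch at batch size 512, assuming a single device, no gradient accumulation, and that the last partial batch is kept. None of those assumptions is stated in the card, and the helper names are hypothetical.

```python
# Validation loss for epochs 1..40, copied from the table above.
VAL_LOSS = [
    3.1066, 2.1494, 1.7102, 1.5843, 1.5400, 1.4790, 1.3848, 1.3430,
    1.3113, 1.2591, 1.1202, 1.1988, 1.0635, 1.0404, 1.0058, 0.9632,
    0.9427, 0.8890, 0.8726, 0.8342, 0.8450, 0.7971, 0.7805, 0.7417,
    0.7641, 0.7093, 0.7121, 0.7131, 0.6806, 0.6572, 0.6542, 0.6507,
    0.6411, 0.6227, 0.6102, 0.6125, 0.5946, 0.5912, 0.5893, 0.5857,
]


def best_epoch(losses):
    """Return (1-based epoch, loss) for the lowest validation loss."""
    idx = min(range(len(losses)), key=losses.__getitem__)
    return idx + 1, losses[idx]


def regressions(losses):
    """Return the 1-based epochs whose loss is higher than the previous epoch's."""
    return [i + 1 for i in range(1, len(losses)) if losses[i] > losses[i - 1]]


def train_set_size_bounds(steps_per_epoch, batch_size):
    """Bounds on n such that ceil(n / batch_size) == steps_per_epoch.

    Assumes one device, no gradient accumulation, and a kept partial batch.
    """
    low = (steps_per_epoch - 1) * batch_size + 1
    high = steps_per_epoch * batch_size
    return low, high


if __name__ == "__main__":
    print("best epoch:", best_epoch(VAL_LOSS))
    print("epochs with higher validation loss:", regressions(VAL_LOSS))
    print("implied training-set size:", train_set_size_bounds(6, 512))
```

On this table the lowest validation loss is at the final epoch (0.5857 at epoch 40), and validation loss rose at epochs 12, 21, 25, 27, 28, and 36. Under the stated assumptions the training set would hold between 2561 and 3072 examples.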