calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7503

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
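As a rough illustration of the linear scheduler listed above, here is a minimal sketch of the learning-rate curve it produces, assuming plain linear decay from the base rate to zero with no warmup (the warmup setting is not stated in this card; the function name and defaults below are illustrative, not part of the training code):

```python
def linear_lr(step, base_lr=0.001, total_steps=240, warmup_steps=0):
    """Linearly ramp up over warmup_steps, then decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, midpoint, and end of the 240 training steps:
print(linear_lr(0))    # 0.001
print(linear_lr(120))  # 0.0005
print(linear_lr(240))  # 0.0
```

With 40 epochs and 6 optimizer steps per epoch (see the results table), the rate decays by 1/40 of its initial value each epoch.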

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4058        | 1.0   | 6    | 2.7645          |
| 2.3687        | 2.0   | 12   | 2.0460          |
| 1.8458        | 3.0   | 18   | 1.6800          |
| 1.6366        | 4.0   | 24   | 1.5984          |
| 1.5803        | 5.0   | 30   | 1.5635          |
| 1.5446        | 6.0   | 36   | 1.5391          |
| 1.5226        | 7.0   | 42   | 1.5339          |
| 1.5374        | 8.0   | 48   | 1.5383          |
| 1.5543        | 9.0   | 54   | 1.5519          |
| 1.5328        | 10.0  | 60   | 1.6631          |
| 1.5779        | 11.0  | 66   | 1.5271          |
| 1.5112        | 12.0  | 72   | 1.5244          |
| 1.5213        | 13.0  | 78   | 1.4900          |
| 1.5005        | 14.0  | 84   | 1.4802          |
| 1.4843        | 15.0  | 90   | 1.4984          |
| 1.4506        | 16.0  | 96   | 1.4163          |
| 1.4083        | 17.0  | 102  | 1.3781          |
| 1.3669        | 18.0  | 108  | 1.3816          |
| 1.3516        | 19.0  | 114  | 1.2972          |
| 1.2930        | 20.0  | 120  | 1.2405          |
| 1.2000        | 21.0  | 126  | 1.1689          |
| 1.1558        | 22.0  | 132  | 1.1402          |
| 1.1391        | 23.0  | 138  | 1.0921          |
| 1.0899        | 24.0  | 144  | 1.0230          |
| 1.0309        | 25.0  | 150  | 1.0130          |
| 1.0078        | 26.0  | 156  | 0.9680          |
| 0.9870        | 27.0  | 162  | 0.9480          |
| 0.9573        | 28.0  | 168  | 0.9060          |
| 0.9286        | 29.0  | 174  | 0.8876          |
| 0.9278        | 30.0  | 180  | 0.8956          |
| 0.9261        | 31.0  | 186  | 0.8770          |
| 0.8869        | 32.0  | 192  | 0.8407          |
| 0.8674        | 33.0  | 198  | 0.8311          |
| 0.8522        | 34.0  | 204  | 0.8045          |
| 0.8397        | 35.0  | 210  | 0.7907          |
| 0.8271        | 36.0  | 216  | 0.7811          |
| 0.8157        | 37.0  | 222  | 0.7684          |
| 0.8234        | 38.0  | 228  | 0.7593          |
| 0.7975        | 39.0  | 234  | 0.7524          |
| 0.8024        | 40.0  | 240  | 0.7503          |

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.5.1+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
Model details

  • Format: Safetensors
  • Model size: 7.8M params
  • Tensor type: F32