calculator_model_test

This model is a fine-tuned version of an unspecified base model on an unspecified dataset. It achieves the following results on the evaluation set:

  • Loss: 0.7503

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.001
  • train_batch_size: 512
  • eval_batch_size: 512
  • seed: 42
  • optimizer: AdamW (torch implementation) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 40
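As a rough illustration of the linear scheduler listed above, here is a minimal sketch of the learning-rate curve it produces, assuming plain linear decay from the base rate to zero with no warmup (the warmup setting is not stated in this card; the function name and defaults below are illustrative, not part of the training code):

```python
def linear_lr(step, base_lr=0.001, total_steps=240, warmup_steps=0):
    """Linearly ramp up over warmup_steps, then decay to 0 at total_steps."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)

# Learning rate at the start, midpoint, and end of the 240 training steps:
print(linear_lr(0))    # 0.001
print(linear_lr(120))  # 0.0005
print(linear_lr(240))  # 0.0
```

With 40 epochs and 6 optimizer steps per epoch (see the results table), the rate decays by 1/40 of its initial value each epoch.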

Training results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 3.4058        | 1.0   | 6    | 2.7645          |
| 2.3687        | 2.0   | 12   | 2.0460          |
| 1.8458        | 3.0   | 18   | 1.6800          |
| 1.6366        | 4.0   | 24   | 1.5984          |
| 1.5803        | 5.0   | 30   | 1.5635          |
| 1.5446        | 6.0   | 36   | 1.5391          |
| 1.5226        | 7.0   | 42   | 1.5339          |
| 1.5374        | 8.0   | 48   | 1.5383          |
| 1.5543        | 9.0   | 54   | 1.5519          |
| 1.5328        | 10.0  | 60   | 1.6631          |
| 1.5779        | 11.0  | 66   | 1.5271          |
| 1.5112        | 12.0  | 72   | 1.5244          |
| 1.5213        | 13.0  | 78   | 1.4900          |
| 1.5005        | 14.0  | 84   | 1.4802          |
| 1.4843        | 15.0  | 90   | 1.4984          |
| 1.4506        | 16.0  | 96   | 1.4163          |
| 1.4083        | 17.0  | 102  | 1.3781          |
| 1.3669        | 18.0  | 108  | 1.3816          |
| 1.3516        | 19.0  | 114  | 1.2972          |
| 1.2930        | 20.0  | 120  | 1.2405          |
| 1.2000        | 21.0  | 126  | 1.1689          |
| 1.1558        | 22.0  | 132  | 1.1402          |
| 1.1391        | 23.0  | 138  | 1.0921          |
| 1.0899        | 24.0  | 144  | 1.0230          |
| 1.0309        | 25.0  | 150  | 1.0130          |
| 1.0078        | 26.0  | 156  | 0.9680          |
| 0.9870        | 27.0  | 162  | 0.9480          |
| 0.9573        | 28.0  | 168  | 0.9060          |
| 0.9286        | 29.0  | 174  | 0.8876          |
| 0.9278        | 30.0  | 180  | 0.8956          |
| 0.9261        | 31.0  | 186  | 0.8770          |
| 0.8869        | 32.0  | 192  | 0.8407          |
| 0.8674        | 33.0  | 198  | 0.8311          |
| 0.8522        | 34.0  | 204  | 0.8045          |
| 0.8397        | 35.0  | 210  | 0.7907          |
| 0.8271        | 36.0  | 216  | 0.7811          |
| 0.8157        | 37.0  | 222  | 0.7684          |
| 0.8234        | 38.0  | 228  | 0.7593          |
| 0.7975        | 39.0  | 234  | 0.7524          |
| 0.8024        | 40.0  | 240  | 0.7503          |

Framework versions

  • Transformers 4.48.3
  • PyTorch 2.5.1+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0
Model details

  • Format: Safetensors
  • Model size: 7.8M params
  • Tensor type: F32