roberta-base-pp-1000000-1e-06-128-negcommonsensebalanced-1e-06-256

This model is a fine-tuned version of mhr2004/roberta-base-pp-1000000-1e-06-128. The fine-tuning dataset is not documented. It achieves the following results on the evaluation set:

  • Loss: 0.4009 (validation loss at epoch 25, the last logged epoch)
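
The card does not say how the checkpoint is meant to be used, so the following is only a minimal loading sketch. It assumes the `transformers` and `torch` packages are installed and that the checkpoint carries a sequence-classification head; the task is not stated anywhere in the card, so the choice of `AutoModelForSequenceClassification` is an assumption, and the helper names `load_model_and_tokenizer` and `score` are hypothetical.

```python
MODEL_ID = "mhr2004/roberta-base-pp-1000000-1e-06-128-negcommonsensebalanced-1e-06-256"


def load_model_and_tokenizer(model_id: str = MODEL_ID):
    """Download the checkpoint and its tokenizer from the Hugging Face Hub."""
    # Imported here so the heavy dependency is only needed when actually loading.
    from transformers import AutoModelForSequenceClassification, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # ASSUMPTION: a sequence-classification head; the card does not name the task.
    model = AutoModelForSequenceClassification.from_pretrained(model_id)
    model.eval()  # disable dropout for inference
    return tokenizer, model


def score(text: str, tokenizer, model):
    """Return the raw logits for a single input string."""
    import torch

    inputs = tokenizer(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        return model(**inputs).logits
```

Because the label set is undocumented, the sketch returns raw logits rather than label names; `model.config.id2label` may help interpret them once the model is loaded.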

Model description

More information needed

Intended uses & limitations

More information needed

Training and evaluation data

More information needed

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 1e-06
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • optimizer: adamw_torch with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • num_epochs: 30 (the training results table logs epochs 1 through 25 only)
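
To make the `linear` scheduler setting concrete, here is a small pure-Python sketch of a linearly decaying learning rate. It assumes zero warmup steps (the card lists none) and 795 optimizer steps per epoch (taken from the step column of the training results table), so the planned run would span 30 × 795 steps. The function `linear_lr` is a hypothetical helper, not the library's own implementation.

```python
BASE_LR = 1e-06        # learning_rate from the card
NUM_EPOCHS = 30        # num_epochs from the card
STEPS_PER_EPOCH = 795  # from the results table; assumed constant across epochs
WARMUP_STEPS = 0       # ASSUMPTION: the card does not list any warmup

TOTAL_STEPS = NUM_EPOCHS * STEPS_PER_EPOCH


def linear_lr(step: int,
              base_lr: float = BASE_LR,
              total_steps: int = TOTAL_STEPS,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Learning rate at a given optimizer step under linear warmup and decay."""
    if step < warmup_steps:
        # Linear ramp from 0 up to base_lr during warmup.
        return base_lr * step / max(1, warmup_steps)
    # Linear decay from base_lr down to 0 at total_steps, clamped at 0 afterwards.
    remaining = max(0, total_steps - step)
    return base_lr * remaining / max(1, total_steps - warmup_steps)


if __name__ == "__main__":
    for epoch in (0, 10, 20, 25, 30):
        step = epoch * STEPS_PER_EPOCH
        print(f"epoch {epoch:2d}  step {step:5d}  lr {linear_lr(step):.3e}")
```

Under these assumptions, the rate at the last logged step (19875) would be one sixth of the initial value, since 5 of the 30 planned epochs remain.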

Training results

| Training Loss | Epoch | Step  | Validation Loss |
|---------------|-------|-------|-----------------|
| 0.5718        | 1.0   | 795   | 0.5297          |
| 0.5233        | 2.0   | 1590  | 0.4949          |
| 0.4986        | 3.0   | 2385  | 0.4810          |
| 0.4843        | 4.0   | 3180  | 0.4702          |
| 0.4699        | 5.0   | 3975  | 0.4565          |
| 0.4601        | 6.0   | 4770  | 0.4506          |
| 0.4540        | 7.0   | 5565  | 0.4388          |
| 0.4446        | 8.0   | 6360  | 0.4368          |
| 0.4353        | 9.0   | 7155  | 0.4323          |
| 0.4274        | 10.0  | 7950  | 0.4255          |
| 0.4219        | 11.0  | 8745  | 0.4221          |
| 0.4178        | 12.0  | 9540  | 0.4173          |
| 0.4128        | 13.0  | 10335 | 0.4210          |
| 0.4099        | 14.0  | 11130 | 0.4135          |
| 0.4071        | 15.0  | 11925 | 0.4088          |
| 0.3983        | 16.0  | 12720 | 0.4081          |
| 0.3979        | 17.0  | 13515 | 0.4089          |
| 0.3961        | 18.0  | 14310 | 0.4083          |
| 0.3946        | 19.0  | 15105 | 0.4054          |
| 0.3899        | 20.0  | 15900 | 0.4030          |
| 0.3907        | 21.0  | 16695 | 0.4034          |
| 0.3831        | 22.0  | 17490 | 0.4006          |
| 0.3846        | 23.0  | 18285 | 0.4018          |
| 0.3850        | 24.0  | 19080 | 0.4023          |
| 0.3796        | 25.0  | 19875 | 0.4009          |
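
A few quantities can be read off the logged numbers directly. The sketch below copies the per-epoch losses from the table and computes the epoch with the lowest validation loss, the final gap between validation and training loss, and a rough size estimate for the training set. The size estimate assumes that `train_batch_size` is the total batch per optimizer step (a single device, no gradient accumulation), which the card does not confirm.

```python
# Per-epoch losses copied from the training results table (epochs 1 to 25).
TRAIN_LOSS = [0.5718, 0.5233, 0.4986, 0.4843, 0.4699, 0.4601, 0.454, 0.4446,
              0.4353, 0.4274, 0.4219, 0.4178, 0.4128, 0.4099, 0.4071, 0.3983,
              0.3979, 0.3961, 0.3946, 0.3899, 0.3907, 0.3831, 0.3846, 0.385,
              0.3796]
VAL_LOSS = [0.5297, 0.4949, 0.4810, 0.4702, 0.4565, 0.4506, 0.4388, 0.4368,
            0.4323, 0.4255, 0.4221, 0.4173, 0.4210, 0.4135, 0.4088, 0.4081,
            0.4089, 0.4083, 0.4054, 0.4030, 0.4034, 0.4006, 0.4018, 0.4023,
            0.4009]

STEPS_PER_EPOCH = 795   # step column increases by 795 per epoch
TRAIN_BATCH_SIZE = 256  # train_batch_size from the card

# Epoch (1-indexed) with the lowest validation loss.
best_epoch = min(range(len(VAL_LOSS)), key=VAL_LOSS.__getitem__) + 1
best_val = VAL_LOSS[best_epoch - 1]

# Gap between validation and training loss at the last logged epoch.
final_gap = VAL_LOSS[-1] - TRAIN_LOSS[-1]

# ASSUMPTION: one device, no gradient accumulation, so each step sees one batch.
# The last batch of an epoch may be partial, so this is an upper estimate.
approx_train_examples = STEPS_PER_EPOCH * TRAIN_BATCH_SIZE

if __name__ == "__main__":
    print(f"best epoch: {best_epoch} (validation loss {best_val:.4f})")
    print(f"final val - train gap: {final_gap:.4f}")
    print(f"approx. training examples: {approx_train_examples}")
```

The lowest logged validation loss is at epoch 22 (0.4006), slightly below the 0.4009 reported for the final logged epoch, and validation loss changes little after epoch 20.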

Framework versions

  • Transformers 4.49.0
  • Pytorch 2.6.0+cu124
  • Datasets 3.3.2
  • Tokenizers 0.21.0

Model size

  • Parameters: 125M (Safetensors)
  • Tensor type: F32