---
library_name: transformers
language:
  - multilingual
  - bn
  - cs
  - de
  - en
  - et
  - fi
  - fr
  - gu
  - ha
  - hi
  - is
  - ja
  - kk
  - km
  - lt
  - lv
  - pl
  - ps
  - ru
  - ta
  - tr
  - uk
  - xh
  - zh
  - zu
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
  - quality-estimation
  - regression
  - generated_from_trainer
datasets:
  - ymoslem/wmt-da-human-evaluation-long-context
model-index:
  - name: Quality Estimation for Machine Translation
    results: []
---

Quality Estimation for Machine Translation

This model is a fine-tuned version of answerdotai/ModernBERT-base on the ymoslem/wmt-da-human-evaluation-long-context dataset. It achieves the following results on the evaluation set:

  • Loss: 0.0214
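
The snippet below is a minimal usage sketch, assuming the model loads through the standard transformers sequence-classification API with a single regression output; the repository id placeholder and the sentence-pair input format (source first, translation second) are assumptions, not confirmed by this card.

```python
# Minimal inference sketch. Assumptions: the model loads via
# AutoModelForSequenceClassification with one regression label, and
# source/translation are encoded as a tokenizer sentence pair.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "<this-repo-id>"  # replace with this model's Hub id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

source = "Le chat dort sur le canapé."
translation = "The cat is sleeping on the sofa."

# Encode source and translation together; truncation guards long inputs.
inputs = tokenizer(source, translation, truncation=True, return_tensors="pt")

with torch.no_grad():
    score = model(**inputs).logits.squeeze().item()

print(f"Predicted quality score: {score:.4f}")
```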

Model description

This is a regression model for quality estimation (QE) of machine translation: given a source text and its machine translation, it predicts a single quality score. It fine-tunes answerdotai/ModernBERT-base on WMT direct-assessment (DA) human judgements extended to long-context (multi-sentence) examples, covering the languages listed in the metadata above.

Intended uses & limitations

The model is intended for reference-free (source- and translation-only) quality estimation of machine translation in the languages listed in the metadata above. Predicted scores follow the scale of the DA training annotations and are best interpreted comparatively, e.g. for ranking or filtering candidate translations; behaviour on language pairs, domains, or sequence lengths outside the training data has not been evaluated here.

Training and evaluation data

The model was trained and evaluated on the ymoslem/wmt-da-human-evaluation-long-context dataset, which pairs source segments and machine translations with direct-assessment (DA) human quality scores from WMT evaluation campaigns, extended to long-context examples.
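
As a quick way to inspect the data, the sketch below loads it with the datasets library; beyond the existence of a "train" split, the split and column names printed here are whatever the dataset defines.

```python
# Sketch for inspecting the training data; assumes only that the
# dataset exposes a "train" split.
from datasets import load_dataset

dataset = load_dataset("ymoslem/wmt-da-human-evaluation-long-context")
print(dataset)              # splits and column names
print(dataset["train"][0])  # one example with its human quality score
```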

Training procedure

Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0003
  • train_batch_size: 128
  • eval_batch_size: 128
  • seed: 42
  • optimizer: AdamW (fused Torch implementation, adamw_torch_fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: linear
  • training_steps: 60000
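
For reference, these values map onto transformers TrainingArguments roughly as sketched below; output_dir is a placeholder, and settings not listed above (e.g. evaluation and logging cadence) are omitted rather than guessed.

```python
# Hyperparameters from the list above expressed as TrainingArguments.
# output_dir is a placeholder; fields not listed in the card are omitted.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="qe-modernbert",       # placeholder
    learning_rate=3e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch_fused",        # AdamW, betas=(0.9, 0.999), eps=1e-8
    lr_scheduler_type="linear",
    max_steps=60_000,
)
```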

Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 0.0233        | 0.0167 | 1000  | 0.0233          |
| 0.0232        | 0.0335 | 2000  | 0.0230          |
| 0.0225        | 0.0502 | 3000  | 0.0230          |
| 0.023         | 0.0669 | 4000  | 0.0224          |
| 0.0226        | 0.0837 | 5000  | 0.0223          |
| 0.0226        | 0.1004 | 6000  | 0.0225          |
| 0.0219        | 0.1171 | 7000  | 0.0222          |
| 0.022         | 0.1339 | 8000  | 0.0222          |
| 0.0213        | 0.1506 | 9000  | 0.0221          |
| 0.0213        | 0.1673 | 10000 | 0.0220          |
| 0.0218        | 0.1840 | 11000 | 0.0219          |
| 0.0215        | 0.2008 | 12000 | 0.0225          |
| 0.0218        | 0.2175 | 13000 | 0.0219          |
| 0.0218        | 0.2342 | 14000 | 0.0218          |
| 0.0217        | 0.2510 | 15000 | 0.0219          |
| 0.0219        | 0.2677 | 16000 | 0.0219          |
| 0.0212        | 0.2844 | 17000 | 0.0219          |
| 0.0219        | 0.3012 | 18000 | 0.0219          |
| 0.0218        | 0.3179 | 19000 | 0.0219          |
| 0.0213        | 0.3346 | 20000 | 0.0217          |
| 0.0218        | 0.3514 | 21000 | 0.0217          |
| 0.021         | 0.3681 | 22000 | 0.0217          |
| 0.0219        | 0.3848 | 23000 | 0.0220          |
| 0.0211        | 0.4016 | 24000 | 0.0216          |
| 0.0211        | 0.4183 | 25000 | 0.0216          |
| 0.0206        | 0.4350 | 26000 | 0.0216          |
| 0.021         | 0.4517 | 27000 | 0.0215          |
| 0.0214        | 0.4685 | 28000 | 0.0215          |
| 0.0214        | 0.4852 | 29000 | 0.0216          |
| 0.0204        | 0.5019 | 30000 | 0.0216          |
| 0.022         | 0.5187 | 31000 | 0.0216          |
| 0.0212        | 0.5354 | 32000 | 0.0217          |
| 0.0211        | 0.5521 | 33000 | 0.0216          |
| 0.0208        | 0.5689 | 34000 | 0.0215          |
| 0.0208        | 0.5856 | 35000 | 0.0215          |
| 0.0215        | 0.6023 | 36000 | 0.0215          |
| 0.0212        | 0.6191 | 37000 | 0.0215          |
| 0.0213        | 0.6358 | 38000 | 0.0215          |
| 0.0211        | 0.6525 | 39000 | 0.0215          |
| 0.0208        | 0.6693 | 40000 | 0.0215          |
| 0.0205        | 0.6860 | 41000 | 0.0215          |
| 0.0209        | 0.7027 | 42000 | 0.0215          |
| 0.021         | 0.7194 | 43000 | 0.0215          |
| 0.0207        | 0.7362 | 44000 | 0.0215          |
| 0.0197        | 0.7529 | 45000 | 0.0215          |
| 0.0211        | 0.7696 | 46000 | 0.0214          |
| 0.021         | 0.7864 | 47000 | 0.0215          |
| 0.0207        | 0.8031 | 48000 | 0.0214          |
| 0.0219        | 0.8198 | 49000 | 0.0215          |
| 0.0208        | 0.8366 | 50000 | 0.0215          |
| 0.0202        | 0.8533 | 51000 | 0.0215          |
| 0.02          | 0.8700 | 52000 | 0.0215          |
| 0.0205        | 0.8868 | 53000 | 0.0214          |
| 0.0214        | 0.9035 | 54000 | 0.0215          |
| 0.0205        | 0.9202 | 55000 | 0.0214          |
| 0.0209        | 0.9370 | 56000 | 0.0214          |
| 0.0206        | 0.9537 | 57000 | 0.0214          |
| 0.0204        | 0.9704 | 58000 | 0.0214          |
| 0.0203        | 0.9872 | 59000 | 0.0214          |
| 0.0209        | 1.0039 | 60000 | 0.0214          |

Framework versions

  • Transformers 4.48.1
  • Pytorch 2.4.1+cu124
  • Datasets 3.2.0
  • Tokenizers 0.21.0