---
library_name: transformers
language:
- multilingual
- bn
- cs
- de
- en
- et
- fi
- fr
- gu
- ha
- hi
- is
- ja
- kk
- km
- lt
- lv
- pl
- ps
- ru
- ta
- tr
- uk
- xh
- zh
- zu
license: apache-2.0
base_model: answerdotai/ModernBERT-base
tags:
- quality-estimation
- regression
- generated_from_trainer
datasets:
- ymoslem/tokenized-wmt-da-human-evaluation
model-index:
- name: Quality Estimation for Machine Translation
  results:
  - task:
      type: regression
    dataset:
      name: ymoslem/wmt-da-human-evaluation-long-context
      type: QE
    metrics:
    - name: Pearson
      type: Pearson Correlation
      value: 0.4465
    - name: MAE
      type: Mean Absolute Error
      value: 0.126
    - name: RMSE
      type: Root Mean Squared Error
      value: 0.1623
    - name: R-R2
      type: R-Squared
      value: 0.0801
  - task:
      type: regression
    dataset:
      name: ymoslem/wmt-da-human-evaluation
      type: QE
    metrics:
    - name: Pearson
      type: Pearson Correlation
      value: 
    - name: MAE
      type: Mean Absolute Error
      value: 
    - name: RMSE
      type: Root Mean Squared Error
      value: 
    - name: R-R2
      type: R-Squared
      value: 
metrics:
- pearsonr
- mae
- r_squared
---

# Quality Estimation for Machine Translation

This model is a fine-tuned version of [answerdotai/ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) on the ymoslem/tokenized-wmt-da-human-evaluation dataset.
It achieves the following results on the evaluation set:
- Loss: 0.0571

## Model description

This model is for reference-free, sentence-level quality estimation (QE) of machine translation (MT) systems.
The long-context / document-level model can be found at [ModernBERT-base-long-context-qe-v1](https://huggingface.co/ymoslem/ModernBERT-base-long-context-qe-v1),
which is trained on the long-context / document-level QE dataset [ymoslem/wmt-da-human-evaluation-long-context](https://huggingface.co/datasets/ymoslem/wmt-da-human-evaluation-long-context).
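
A minimal inference sketch follows. The repository id, the text-pair input format, and the single-logit regression output are assumptions based on the card above rather than documented usage, so adjust them to match how the model was actually trained.

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Assumed repository id; replace with this model's actual id.
model_id = "ymoslem/ModernBERT-base-qe-v1"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

src = "Le chat est assis sur le tapis."
mt = "The cat sits on the mat."

# Assumption: the source sentence and its machine translation are
# encoded as a text pair (source first, translation second).
inputs = tokenizer(src, mt, truncation=True, return_tensors="pt")

with torch.no_grad():
    # Regression head: a single logit interpreted as the quality score.
    score = model(**inputs).logits.squeeze().item()

print(f"Predicted quality score: {score:.4f}")
```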

## Training and evaluation data

This model is trained on the sentence-level quality estimation dataset [ymoslem/wmt-da-human-evaluation](https://huggingface.co/datasets/ymoslem/wmt-da-human-evaluation).

## Training procedure

This version of the model uses the tokenizer's full maximum length, `tokenizer.model_max_length=8192`,
but it is still trained on the sentence-level QE dataset [ymoslem/wmt-da-human-evaluation](https://huggingface.co/datasets/ymoslem/wmt-da-human-evaluation).
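
As a hedged sketch (assuming the raw dataset exposes `src` and `mt` columns and that source/translation pairs are encoded as text pairs), the preprocessing that produces the tokenized dataset might look like this:

```python
from datasets import load_dataset
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("answerdotai/ModernBERT-base")

# Sentence-level QE data: source, machine translation, and a human quality score.
dataset = load_dataset("ymoslem/wmt-da-human-evaluation", split="train")

def preprocess(batch):
    # Tokenize the (source, translation) pair up to the full context length
    # (tokenizer.model_max_length = 8192), even though sentence pairs are short.
    return tokenizer(
        batch["src"],
        batch["mt"],
        truncation=True,
        max_length=tokenizer.model_max_length,
    )

tokenized_dataset = dataset.map(preprocess, batched=True)
```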

For a model trained on the long-context / document-level QE dataset [ymoslem/wmt-da-human-evaluation-long-context](https://huggingface.co/datasets/ymoslem/wmt-da-human-evaluation-long-context), see [ModernBERT-base-long-context-qe-v1](https://huggingface.co/ymoslem/ModernBERT-base-long-context-qe-v1).


### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 0.0003
- train_batch_size: 128
- eval_batch_size: 128
- seed: 42
- optimizer: AdamW (torch fused) with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
- lr_scheduler_type: linear
- training_steps: 10000
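
A hedged sketch of how these settings might be expressed as `TrainingArguments` follows; the output directory is a placeholder, and the evaluation interval of 1,000 steps is inferred from the results table below.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="ModernBERT-base-qe",  # placeholder output directory
    learning_rate=3e-4,
    per_device_train_batch_size=128,
    per_device_eval_batch_size=128,
    seed=42,
    optim="adamw_torch_fused",
    adam_beta1=0.9,
    adam_beta2=0.999,
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    max_steps=10_000,
    eval_strategy="steps",
    eval_steps=1_000,
)
```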

### Training results

| Training Loss | Epoch  | Step  | Validation Loss |
|:-------------:|:------:|:-----:|:---------------:|
| 0.0686        | 0.1004 | 1000  | 0.0712          |
| 0.0652        | 0.2007 | 2000  | 0.0687          |
| 0.0648        | 0.3011 | 3000  | 0.0623          |
| 0.0609        | 0.4015 | 4000  | 0.0600          |
| 0.0585        | 0.5019 | 5000  | 0.0603          |
| 0.0588        | 0.6022 | 6000  | 0.0589          |
| 0.0592        | 0.7026 | 7000  | 0.0581          |
| 0.0585        | 0.8030 | 8000  | 0.0574          |
| 0.0588        | 0.9033 | 9000  | 0.0572          |
| 0.0563        | 1.0037 | 10000 | 0.0571          |


### Framework versions

- Transformers 4.48.1
- Pytorch 2.4.1+cu124
- Datasets 3.2.0
- Tokenizers 0.21.0