Update README.md
README.md
CHANGED
@@ -74,14 +74,19 @@ It achieves the following results on the evaluation set:

## Model description

- This model is for reference-free quality estimation (QE) of machine translation (MT) systems.

## Training and evaluation data

The model is trained on the long-context dataset [ymoslem/wmt-da-human-evaluation-long-context](https://huggingface.co/datasets/ymoslem/wmt-da-human-evaluation-long-context).

- * Training: 7.65 million long-context texts
- * Test: 59,235 long-context texts

## Training procedure

## Model description

+ This model is for reference-free, long-context quality estimation (QE) of machine translation (MT) systems.
+ It was trained on a dataset of texts of up to 32 sentences (64 sentences for the source and target combined).
+ Hence, this model is suitable for document-level quality estimation.

## Training and evaluation data

The model is trained on the long-context dataset [ymoslem/wmt-da-human-evaluation-long-context](https://huggingface.co/datasets/ymoslem/wmt-da-human-evaluation-long-context).
+ This long-context / document-level dataset for quality estimation of machine translation is an augmented variant of the sentence-level WMT DA Human Evaluation dataset.
+ In addition to individual sentences, it contains augmentations of 2, 4, 8, 16, and 32 sentences within each language pair `lp` and `domain`.
+ The `raw` column is a weighted average of the scores of the augmented sentences, using the character lengths of `src` and `mt` as weights.
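
The augmentation and the length-weighted `raw` score described above can be sketched roughly as follows. This is only an illustration, not the dataset author's script. The helper names `weighted_raw` and `augment` are hypothetical. The sketch also assumes that sentences are grouped consecutively within each (`lp`, `domain`) bucket, that texts are joined with a space, that leftover sentences that do not fill a chunk are dropped, and that each sentence's weight is `len(src) + len(mt)`.

```python
from collections import defaultdict

CHUNK_SIZES = (2, 4, 8, 16, 32)  # augmentation sizes listed in the README


def weighted_raw(rows):
    """Length-weighted mean of `raw`.

    Assumption: the weight of a sentence is len(src) + len(mt).
    """
    weights = [len(r["src"]) + len(r["mt"]) for r in rows]
    total = sum(weights)
    if total == 0:
        raise ValueError("cannot weight rows with empty src and mt")
    return sum(w * r["raw"] for w, r in zip(weights, rows)) / total


def augment(rows, sizes=CHUNK_SIZES):
    """Build long-context rows from sentence-level rows (hypothetical helper).

    Assumption: consecutive grouping inside each (lp, domain) bucket,
    texts joined with a space, incomplete trailing chunks dropped.
    """
    buckets = defaultdict(list)
    for row in rows:
        buckets[(row["lp"], row["domain"])].append(row)

    out = []
    for (lp, domain), bucket in buckets.items():
        for size in sizes:
            # Step through the bucket in non-overlapping chunks of `size`.
            for start in range(0, len(bucket) - size + 1, size):
                chunk = bucket[start:start + size]
                out.append({
                    "lp": lp,
                    "domain": domain,
                    "src": " ".join(r["src"] for r in chunk),
                    "mt": " ".join(r["mt"] for r in chunk),
                    "raw": weighted_raw(chunk),
                })
    return out


example_rows = [
    {"lp": "en-de", "domain": "news", "src": "ab", "mt": "cd", "raw": 80.0},
    {"lp": "en-de", "domain": "news", "src": "abcdef", "mt": "ghijkl", "raw": 40.0},
]
# Weights are 4 and 12, so the score is (4 * 80 + 12 * 40) / 16.
print(augment(example_rows, sizes=(2,))[0]["raw"])  # 50.0
```

Under these assumptions the longer sentence dominates the combined score, which is presumably the motivation for weighting by character length rather than taking a plain mean.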

+ * Training data: 7.65 million long-context texts
+ * Test data: 59,235 long-context texts

## Training procedure