---
datasets:
- AfnanTS/Final_ArLAMA_DS_tokenized_for_ARBERTv2
language:
- ar
base_model:
- UBC-NLP/ARBERTv2
pipeline_tag: fill-mask
---


<img src="./Transparent.png" alt="Model Logo" width="30%" height="30%" align="right"/>

The **ArBERTV2_EL** model is a transformer-based Arabic language model fine-tuned with an Entity Linking (EL) objective. It leverages Knowledge Graphs (KGs) so that Masked Language Modeling (MLM) models can be evaluated intrinsically without evaluating the EL model directly, and the EL task ensures that the model benefits from the incorporation of structured knowledge during pre-training.


## Uses


### Direct Use


Filling masked tokens in Arabic text, particularly in contexts enriched with knowledge from KGs.


### Downstream Use

Can be further fine-tuned for Arabic NLP tasks that require semantic understanding, such as text classification or question answering.
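As a rough illustration (not an official recipe), the sketch below fine-tunes this checkpoint for binary text classification with the Hugging Face `Trainer`; the two-example sentiment dataset, label count, and output directory are placeholders, not part of this model's release.

```python
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("AfnanTS/ArBERTV2_EL")
model = AutoModelForSequenceClassification.from_pretrained(
    "AfnanTS/ArBERTV2_EL", num_labels=2  # label count depends on your task
)

# Tiny illustrative dataset; replace with your own labelled Arabic data.
train_ds = Dataset.from_dict({
    "text": ["الخدمة ممتازة جدا", "التجربة كانت سيئة"],
    "label": [1, 0],
})
train_ds = train_ds.map(
    lambda ex: tokenizer(ex["text"], truncation=True, padding="max_length", max_length=64)
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="arbertv2_el_cls", num_train_epochs=1),
    train_dataset=train_ds,
)
trainer.train()
```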



## How to Get Started with the Model

```python
from transformers import pipeline
# Load the fill-mask pipeline with this model and predict the masked token.
fill_mask = pipeline("fill-mask", model="AfnanTS/ArBERTV2_EL")
fill_mask("اللغة [MASK] مهمة جدا.")
```
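The pipeline returns a list of candidate completions, each a dictionary containing the filled-in `sequence`, the predicted `token_str`, and its probability `score`.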

## Training Details

### Training Data

Trained on the ArLAMA dataset, which is designed to represent Knowledge Graphs in natural language.
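The tokenized release of this data listed in the card metadata can be loaded for inspection; the split name below is an assumption.

```python
from datasets import load_dataset

# Tokenized ArLAMA data prepared for ARBERTv2 (split name assumed to be "train").
arlama = load_dataset("AfnanTS/Final_ArLAMA_DS_tokenized_for_ARBERTv2", split="train")
print(arlama)  # inspect column names and example count
```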



### Training Procedure

Continued pre-training of the ArBERTv2 model using the Entity Linking (EL) task.
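The EL objective itself is not reproduced here. As a hedged sketch, the snippet below shows what continued pre-training of ARBERTv2 on the tokenized ArLAMA data could look like when a standard masked-language-modeling collator is used as a stand-in for the EL task; the split name and hyperparameters are assumptions.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForMaskedLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("UBC-NLP/ARBERTv2")
model = AutoModelForMaskedLM.from_pretrained("UBC-NLP/ARBERTv2")

# Tokenized ArLAMA data from the card metadata (split name assumed).
dataset = load_dataset("AfnanTS/Final_ArLAMA_DS_tokenized_for_ARBERTv2", split="train")

# Standard MLM collator, used here as a stand-in for the EL objective.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm_probability=0.15)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="arbertv2_el_pretrain", num_train_epochs=1),
    train_dataset=dataset,
    data_collator=collator,
)
trainer.train()
```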