premsa committed
Commit 11da66c · verified · 1 Parent(s): adad01b

Update README.md

Files changed (1)
  1. README.md +34 -0
README.md CHANGED
@@ -1,3 +1,37 @@
  ---
  license: apache-2.0
  ---
+
+ model base: https://huggingface.co/microsoft/mdeberta-v3-base
+
+ dataset: https://github.com/ramybaly/Article-Bias-Prediction
+
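Not part of the original card: a minimal loading sketch for the linked dataset, assuming the repository's one-JSON-per-article layout under `data/jsons` and its `content`/`bias` fields; those paths and field names are assumptions taken from that repository, not from this card.

```python
import json
from pathlib import Path

# hypothetical local checkout of https://github.com/ramybaly/Article-Bias-Prediction;
# the per-article JSON layout and the "content"/"bias" field names are assumptions
DATA_DIR = Path("Article-Bias-Prediction/data")

def load_articles(json_dir: Path):
    """Yield (article text, bias label) pairs from one-JSON-per-article files."""
    for path in sorted(json_dir.glob("*.json")):
        with open(path, encoding="utf-8") as handle:
            article = json.load(handle)
        yield article["content"], article["bias"]

examples = list(load_articles(DATA_DIR / "jsons"))
print(f"loaded {len(examples)} articles")
```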
+ training parameters (see the sketch after this list):
+ - devices: 2xH100
+ - batch_size: 100
+ - epochs: 5
+ - dropout: 0.05
+ - max_length: 512
+ - learning_rate: 3e-5
+ - warmup_steps: 100
+ - random_state: 239
+
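Not part of the original card: a sketch of how the hyperparameters listed above could be wired into `transformers`. Several details are assumptions and are flagged in comments: three output classes, dropout applied as the encoder's hidden/attention dropout, batch_size 100 read as 2 devices × 50 per device, an assumed `text` column, and a hypothetical output directory.

```python
from transformers import (
    AutoConfig,
    AutoModelForSequenceClassification,
    AutoTokenizer,
    TrainingArguments,
    set_seed,
)

set_seed(239)  # random_state from the list above

# dropout lives in the model config rather than in TrainingArguments
# (assumption: it means the encoder's hidden/attention dropout probabilities)
config = AutoConfig.from_pretrained(
    "microsoft/mdeberta-v3-base",
    num_labels=3,  # assumption: left / center / right
    hidden_dropout_prob=0.05,
    attention_probs_dropout_prob=0.05,
)
model = AutoModelForSequenceClassification.from_pretrained("microsoft/mdeberta-v3-base", config=config)
tokenizer = AutoTokenizer.from_pretrained("microsoft/mdeberta-v3-base")

def tokenize(batch):
    # max_length from the list above; the "text" column name is an assumption
    return tokenizer(batch["text"], truncation=True, max_length=512)

# batch_size 100 is read as the global batch: 2 x H100 with 50 examples per device (assumption)
training_args = TrainingArguments(
    output_dir="political-bias-mdeberta",  # hypothetical output path
    num_train_epochs=5,
    per_device_train_batch_size=50,
    per_device_eval_batch_size=50,
    learning_rate=3e-5,
    warmup_steps=100,
    seed=239,
)
```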
+
+ training methodology (see the sketch after this list):
+ - sanitize the dataset following a specific rule set and use the random split provided with the dataset
+ - train on the train split and evaluate on the validation split after each epoch
+ - evaluate the test split only with the model that achieved the best validation loss
+
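Also not from the card: a sketch of the evaluation flow described above. It continues from the configuration sketch (reusing `model`) and assumes tokenized `train_ds`/`valid_ds`/`test_ds` built from the dataset's random split; macro averaging for F1 is an assumption, since the card only says "f1 score".

```python
import numpy as np
from sklearn.metrics import f1_score
from transformers import Trainer, TrainingArguments

def compute_metrics(eval_pred):
    # macro-averaged F1 over the bias classes (averaging choice is an assumption)
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    return {"f1": f1_score(labels, preds, average="macro")}

training_args = TrainingArguments(
    output_dir="political-bias-mdeberta",  # hypothetical output path
    num_train_epochs=5,
    eval_strategy="epoch",          # evaluate on the validation split after each epoch
    save_strategy="epoch",
    load_best_model_at_end=True,    # restore the checkpoint with the lowest validation loss
    metric_for_best_model="eval_loss",
    greater_is_better=False,
)

trainer = Trainer(
    model=model,                    # model / train_ds / valid_ds / test_ds assumed from the sketches above
    args=training_args,
    train_dataset=train_ds,
    eval_dataset=valid_ds,
    compute_metrics=compute_metrics,
)
trainer.train()

# the test split is scored only once, with the best-on-validation checkpoint restored above
print(trainer.evaluate(test_ds, metric_key_prefix="test"))
```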
+ result summary:
+ - across the five training epochs, the epoch-x model achieved the lowest validation loss of x
+ - on the test split, the epoch-x model achieved an F1 score of x and a test loss of x
+
+ usage:
+
+ ```python
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline
+
+ # load the fine-tuned model and its tokenizer from the Hub
+ model = AutoModelForSequenceClassification.from_pretrained("premsa/political-bias-prediction-allsides-mDeBERTa")
+ tokenizer = AutoTokenizer.from_pretrained("premsa/political-bias-prediction-allsides-mDeBERTa")
+
+ # build a text-classification pipeline and classify a sample sentence
+ nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)
+ print(nlp("die massen werden von den medien kontrolliert."))  # "the masses are controlled by the media."
+ ```
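Follow-up note (not in the original card): the pipeline call returns a list of dicts with `label` and `score` keys; how the raw label ids map onto the bias classes depends on the `id2label` mapping stored in the model config, so check `model.config.id2label` rather than assuming an order.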