---
license: apache-2.0
---

model base: https://huggingface.co/google-bert/bert-base-uncased

dataset: https://github.com/ramybaly/Article-Bias-Prediction


training parameters:
- batch_size: 100
- epochs: 5
- dropout: 0.05
- max_length: 512
- learning_rate: 3e-5
- warmup_steps: 100
- random_state: 239
+
19
+
20
+ training methodology:
21
+ - sanitize dataset following specific rule-set, utilize random split as provided in the dataset
22
+ - train on train split and evaluate on validation split in each epoch
23
+ - evaluate test split only on the model that performed best on validation loss
24
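The checkpoint-selection rule above can be sketched as follows. The helper name is illustrative, and every loss value except the reported epoch-2 loss of 0.3314 is a made-up placeholder, not a value from the actual run:

```python
def best_checkpoint(val_losses):
    """Return the (1-based epoch, loss) pair with the lowest validation loss.

    Only that checkpoint is then evaluated on the test split, so the
    test set is touched exactly once.
    """
    i = min(range(len(val_losses)), key=val_losses.__getitem__)
    return i + 1, val_losses[i]

# Epoch-2 loss (0.3314) is the value reported below; the other four
# losses are placeholders for illustration.
print(best_checkpoint([0.38, 0.3314, 0.34, 0.36, 0.39]))  # → (2, 0.3314)
```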

result summary:
- across the five training epochs, the model from the second epoch achieved the lowest validation loss, 0.3314
- on the test split, the second-epoch model achieved an F1 score of 0.9041
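The card does not state which F1 averaging variant the 0.9041 refers to. Assuming a macro average over the dataset's three bias labels (left/center/right) — a common choice for multi-class evaluation, but an assumption here — it can be computed as:

```python
def macro_f1(y_true, y_pred):
    """Macro-averaged F1: per-class F1 scores averaged with equal weight."""
    classes = sorted(set(y_true) | set(y_pred))
    scores = []
    for c in classes:
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(scores) / len(scores)

# Toy example with three classes standing in for left/center/right.
print(round(macro_f1([0, 0, 1, 1, 2, 2], [0, 0, 1, 2, 2, 2]), 4))  # → 0.8222
```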

usage:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer, pipeline

model = AutoModelForSequenceClassification.from_pretrained("premsa/political-bias-prediction-allsides-BERT")
tokenizer = AutoTokenizer.from_pretrained("premsa/political-bias-prediction-allsides-BERT")
nlp = pipeline("text-classification", model=model, tokenizer=tokenizer)
print(nlp("the masses are controlled by media."))
```