Offensive language detection
Tasks
The model combines three classifiers, one for each subtask of the OLID dataset [1]:
- Subtask A: offensive (OFF) vs. not offensive (NOT)
- Subtask B: targeted insult or threat (TIN) vs. untargeted (UNT)
- Subtask C: target is an individual (IND), a group (GRP), or other (OTH)
The model was trained with Flair NLP as a multi-task model.
Training data: Offensive Language Identification Dataset (OLID) V1.0 [1]
Test data: test set of the Semi-Supervised Dataset for Offensive Language Identification (SOLID) [2]
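In OLID's hierarchical annotation scheme [1], subtask B applies only to tweets labelled OFF in subtask A, and subtask C only to those labelled TIN in subtask B. A minimal sketch of how the three classifiers' predictions compose under that hierarchy; the classifier callables are stand-ins for the trained Flair models, and the function and field names are illustrative assumptions, not part of this model's API:

```python
def combine_olid_predictions(text, clf_a, clf_b, clf_c):
    """Combine per-subtask predictions into one hierarchical OLID label.

    clf_a, clf_b, clf_c are callables mapping text -> label string
    (placeholders for the three trained classifiers).
    """
    result = {"subtask_a": clf_a(text), "subtask_b": None, "subtask_c": None}
    if result["subtask_a"] == "OFF":      # subtask B is defined only for offensive tweets
        result["subtask_b"] = clf_b(text)
        if result["subtask_b"] == "TIN":  # subtask C is defined only for targeted insults
            result["subtask_c"] = clf_c(text)
    return result
```

With stub classifiers, a NOT prediction in subtask A leaves subtasks B and C unset, mirroring the dataset's annotation.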
Citation
When using this model, please cite:
Gregor Wiedemann, Seid Muhie Yimam, and Chris Biemann. 2020. UHH-LT at SemEval-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1638–1644, Barcelona (online). International Committee for Computational Linguistics.
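For convenience, the same reference as a BibTeX entry (the entry key is chosen here for illustration; all fields are taken from the citation above):

```bibtex
@inproceedings{wiedemann-etal-2020-uhh,
  title     = {{UHH-LT} at {S}em{E}val-2020 Task 12: Fine-Tuning of Pre-Trained Transformer Networks for Offensive Language Detection},
  author    = {Wiedemann, Gregor and Yimam, Seid Muhie and Biemann, Chris},
  booktitle = {Proceedings of the Fourteenth Workshop on Semantic Evaluation},
  pages     = {1638--1644},
  year      = {2020},
  address   = {Barcelona (online)},
  publisher = {International Committee for Computational Linguistics}
}
```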
Evaluation scores
Evaluation was conducted on the English test set of SemEval-2020 Task 12, so the results are directly comparable to those reported in [3].
Task A
Results:
- F-score (micro) 0.9256
- F-score (macro) 0.9131
- Accuracy 0.9256
By class:
              precision    recall  f1-score   support

         NOT     0.9922    0.9042    0.9461      2807
         OFF     0.7976    0.9815    0.8800      1080

    accuracy                         0.9256      3887
   macro avg     0.8949    0.9428    0.9131      3887
weighted avg     0.9381    0.9256    0.9278      3887
Task B
Results:
- F-score (micro) 0.7138
- F-score (macro) 0.6408
- Accuracy 0.7138
By class:
              precision    recall  f1-score   support

         TIN     0.6826    0.9741    0.8027       850
         UNT     0.8947    0.3269    0.4789       572

    accuracy                         0.7138      1422
   macro avg     0.7887    0.6505    0.6408      1422
weighted avg     0.7679    0.7138    0.6724      1422
Task C
Results:
- F-score (micro) 0.8318
- F-score (macro) 0.6978
- Accuracy 0.8318
By class:
              precision    recall  f1-score   support

         IND     0.8703    0.9483    0.9076       580
         GRP     0.7216    0.6684    0.6940       190
         OTH     0.7143    0.3750    0.4918        80

    accuracy                         0.8318       850
   macro avg     0.7687    0.6639    0.6978       850
weighted avg     0.8223    0.8318    0.8207       850
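The averages in these tables follow directly from the per-class scores: for single-label classification the micro F-score equals accuracy, the macro F-score is the unweighted mean of the per-class F1 values, and the weighted average weights each class by its support. A quick sanity check, using the numbers copied from the Task A table:

```python
# Per-class F1 and support, copied from the Task A report above.
f1 = {"NOT": 0.9461, "OFF": 0.8800}
support = {"NOT": 2807, "OFF": 1080}

# Macro F1: unweighted mean over classes.
macro_f1 = sum(f1.values()) / len(f1)

# Weighted F1: each class weighted by its share of the test set.
total = sum(support.values())
weighted_f1 = sum(f1[c] * support[c] / total for c in f1)

assert abs(macro_f1 - 0.9131) < 1e-4     # matches the reported macro F-score
assert abs(weighted_f1 - 0.9278) < 1e-4  # matches the reported weighted avg
```

The same arithmetic reproduces the macro F-scores for Tasks B and C from their per-class rows.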
References
[1] Marcos Zampieri, Shervin Malmasi, Preslav Nakov, Sara Rosenthal, Noura Farra, and Ritesh Kumar. 2019. Predicting the Type and Target of Offensive Posts in Social Media. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 1415–1420, Minneapolis, Minnesota. Association for Computational Linguistics.
[2] Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Marcos Zampieri, and Preslav Nakov. 2021. SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 915–928, Online. Association for Computational Linguistics.
[3] Marcos Zampieri, Preslav Nakov, Sara Rosenthal, Pepa Atanasova, Georgi Karadzhov, Hamdy Mubarak, Leon Derczynski, Zeses Pitenis, and Çağrı Çöltekin. 2020. SemEval-2020 Task 12: Multilingual Offensive Language Identification in Social Media (OffensEval 2020). In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1425–1447, Barcelona (online). International Committee for Computational Linguistics.