---
license: mit
language:
- nl
base_model:
- pdelobelle/robbert-v2-dutch-base
pipeline_tag: text-classification
tags:
- Robbert
- Angry
- finetune
---

# Model Card for AngryBERT

This model is a fine-tuning of [pdelobelle/robbert-v2-dutch-base](https://huggingface.co/pdelobelle/robbert-v2-dutch-base) for the classification of Dutch text as angry or non-angry.

## Model Details

### Model Description

This model is a fine-tuning of [pdelobelle/robbert-v2-dutch-base](https://huggingface.co/pdelobelle/robbert-v2-dutch-base) on a selection of paragraphs mined from the Dutch novel *Ik ga leven* by Lale Gül (Lale Gül, *Ik ga leven*. 2021. Amsterdam: Prometheus. ISBN 978-9044646870; an English translation of the novel exists: Lale Gül, *I Will Live*. 2023. London: Little, Brown Book Group. ISBN 978-1408716809). The model is intended to classify sentences and paragraphs of the book as angry or non-angry. A selection of 55 paragraphs was annotated for angriness by two independent annotators (Cohen's kappa of 0.48).

- **Developed by:** Joris J. van Zundert and Julia Neugarten
- **Funded by:** Huygens Institute
- **Model type:** text classification
- **Language(s) (NLP):** Dutch
- **License:** MIT
- **Finetuned from model:** robbert-v2-dutch-base

## Uses

This model should **only** be used in the context of research on the full text of the Dutch version of Lale Gül's *Ik ga leven*. Any other application is discouraged, as the model has been fine-tuned only on this specific novel. Results obtained with this model in any other context should be treated with the greatest care and skepticism.

## Bias, Risks, and Limitations

The model is biased towards the language of Lale Gül in her novel *Ik ga leven*. This may include a skew towards explicit and aggressive language.
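The inter-annotator agreement reported above (Cohen's kappa) can be reproduced with scikit-learn. A minimal sketch, using short hypothetical label vectors rather than the actual 55 annotations:

```python
# Sketch: computing Cohen's kappa for two annotators.
# The label vectors below are illustrative only, not the real annotations.
from sklearn.metrics import cohen_kappa_score

annotator_a = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]  # 1 = angry, 0 = non-angry
annotator_b = [1, 0, 0, 1, 0, 1, 1, 0, 1, 0]

kappa = cohen_kappa_score(annotator_a, annotator_b)
print(round(kappa, 2))  # 0.6 for these toy vectors
```

A kappa of 0.48, as reported for the actual annotations, indicates moderate agreement.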
### Recommendations

This model should **only** be used in the context of research on the full text of the Dutch version of Lale Gül's *Ik ga leven*. Any other application is discouraged, as the model has been fine-tuned only on this specific novel. Results obtained with this model in any other context should be treated with the greatest care and skepticism.

## How to Get Started with the Model

Use the code below to get started with the model.

```python
from transformers import RobertaTokenizer, RobertaForSequenceClassification
from transformers import TextClassificationPipeline

model = RobertaForSequenceClassification.from_pretrained("./model/angryBERT-v1")
tokenizer = RobertaTokenizer.from_pretrained("./model/angryBERT-v1")

# Just checking that the model works.
# LABEL_1 means angry
# LABEL_0 means non-angry
input_text = "Ik was kwaad."  # en.: "I was angry."
pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True)
pipe(input_text)
# =>
# [[{'label': 'LABEL_0', 'score': 0.026506226509809494},
#   {'label': 'LABEL_1', 'score': 0.9734938144683838}]]
```

## Training Details

### Training Data

All paragraphs of Lale Gül's Dutch novel *Ik ga leven*. Paratext (copyright, title page, etc.) was removed, as was the section of poems at the back of the book.

### Training Procedure

Trained on 55 paragraphs labeled as either angry (1) or non-angry (0).

## Model Card Authors

Joris J. van Zundert, Julia Neugarten

## Model Card Contact

[Joris J. van Zundert](https://huggingface.co/jorisvanzundert)
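As the getting-started snippet shows, the checkpoint exposes generic `LABEL_0`/`LABEL_1` names. A small helper (hypothetical, not part of the model or of the `transformers` API) can map one pipeline result to a readable label:

```python
# Hypothetical helper: map the generic LABEL_0/LABEL_1 ids emitted by the
# pipeline to readable names. Not shipped with the model.
ID2NAME = {"LABEL_0": "non-angry", "LABEL_1": "angry"}

def readable(pipeline_output):
    """Take one pipeline result (a list of {'label', 'score'} dicts)
    and return a (name, score) tuple for the highest-scoring label."""
    best = max(pipeline_output, key=lambda d: d["score"])
    return ID2NAME[best["label"]], best["score"]

# Example using the scores shown in the getting-started snippet (rounded):
result = readable([
    {"label": "LABEL_0", "score": 0.0265},
    {"label": "LABEL_1", "score": 0.9735},
])
print(result)  # ('angry', 0.9735)
```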