---
license: mit
base_model: roberta-base
tags:
- generated_from_trainer
- finance
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: roberta-base_auditor_sentiment
  results: []
language:
- en
library_name: adapter-transformers
pipeline_tag: text-classification
---

<!-- This model card has been generated automatically according to the information the Trainer had access to. You
should probably proofread and complete it, then remove this comment. -->

# roberta-base_auditor_sentiment

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the [auditor_sentiment](https://huggingface.co/datasets/FinanceInc/auditor_sentiment) dataset.

It achieves the following results on the evaluation set:
- Loss: 0.5356
- Accuracy: 0.8554
- Recall: 0.8722
- F1: 0.8414

## Model demonstration

```python
from transformers import pipeline

# Replace with this model's Hugging Face Hub repository id
repository_id = "roberta-base_auditor_sentiment"

classifier = pipeline('text-classification', model=repository_id, device=0)

text = "Boomerang Boats had net sales of EUR 4.1 mn and it made an operating profit of EUR 0.4 mn in 2006 ."
result = classifier(text)
print(result)
```

## Model description

This model, based on the RoBERTa architecture (roberta-base), is fine-tuned for a sentiment classification task specific to the finance sector. It is designed to classify auditor reports into three sentiment categories: "negative", "neutral", and "positive". This capability can be crucial for financial analysis, investment decision-making, and trend analysis in financial reports.
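
The three-way output can be decoded from raw logits as sketched below; the id-to-label order (0 = negative, 1 = neutral, 2 = positive) is an assumption here and should be verified against this model's `config.json`:

```python
import numpy as np

# Assumed id-to-label mapping; verify against the model's config.json
id2label = {0: "negative", 1: "neutral", 2: "positive"}

def logits_to_sentiment(logits):
    """Convert raw classifier logits to a (label, probability) pair."""
    logits = np.asarray(logits, dtype=float)
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    idx = int(probs.argmax())
    return id2label[idx], float(probs[idx])

label, prob = logits_to_sentiment([-1.2, 0.3, 2.1])  # example logits
print(label, round(prob, 3))
```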

## Intended uses & limitations

### Intended Uses

This model is intended for professionals and researchers in the finance industry who require an automated tool to assess the sentiment conveyed in textual data, specifically auditor reports. It can be integrated into financial analysis systems to provide quick insight into sentiment trends, which can aid decision-making processes.

### Limitations

- The model is trained specifically on a dataset from the finance domain and may not perform well on general text or texts from other domains.
- Sentiment is classified into only three categories, which might not fully capture more nuanced sentiment or specific financial jargon.
- Like all AI models, this model should be used as an aid, not a substitute for professional financial analysis.
## Training and evaluation data
|
68 |
|
69 |
+
### Training Data
|
70 |
+
|
71 |
+
The model was trained on a proprietary dataset FinanceInc/auditor_sentiment sourced from Hugging Face datasets, which consists of labeled examples of auditor reports.
|
72 |
+
Each report is annotated with one of three sentiment labels: negative, neutral, and positive.
|
73 |
+
|
74 |
+
### Evaluation Data
|
75 |
+
|
76 |
+
The evaluation was conducted using a split of the same dataset. The data was divided into training and validation sets with a sharding method to ensure a diverse
|
77 |
+
representation of samples in each set.
|
78 |
+
|
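
The sharding split mentioned above can be pictured as a strided partition of the dataset; a minimal pure-Python sketch of the idea (the actual split presumably used `datasets.Dataset.shard` with `contiguous=False`, which behaves the same way):

```python
def shard(examples, num_shards, index):
    """Strided shard: every num_shards-th example, starting at position index."""
    return examples[index::num_shards]

data = list(range(10))  # stand-in for labeled auditor reports

# Hold out one shard for validation and train on the rest
val = shard(data, num_shards=5, index=0)
train = [x for x in data if x not in val]
print(val, len(train))
```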

## Training Procedure

The model was fine-tuned for 5 epochs with a batch size of 8 for both training and evaluation. An initial learning rate of 5e-5 was used with 500 warm-up steps to stabilize the early stages of training. The best model was selected based on its performance on the validation set, and only the two best-performing checkpoints were saved to conserve disk space.
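
These settings correspond roughly to the `TrainingArguments` below; note this is a reconstruction from the description, and `output_dir` and `metric_for_best_model` are illustrative assumptions:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="roberta-base_auditor_sentiment",  # assumed
    num_train_epochs=5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    learning_rate=5e-5,
    warmup_steps=500,
    eval_strategy="epoch",   # metrics computed after each epoch
    save_strategy="epoch",
    save_total_limit=2,      # keep only the two best/most recent checkpoints
    load_best_model_at_end=True,
    metric_for_best_model="f1",  # assumed selection metric
)
```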

## Evaluation Metrics

Evaluation metrics included accuracy, macro precision, macro recall, and macro F1-score, calculated after each epoch. These metrics helped monitor the model's performance and ensure it generalized well beyond the training data.
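
A minimal sketch of how these macro-averaged metrics can be computed; the actual run presumably used `sklearn.metrics` or the `evaluate` library, and this pure-NumPy version is only for illustration:

```python
import numpy as np

def compute_metrics(y_true, y_pred, num_classes=3):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    precisions, recalls, f1s = [], [], []
    for c in range(num_classes):
        tp = np.sum((y_pred == c) & (y_true == c))
        fp = np.sum((y_pred == c) & (y_true != c))
        fn = np.sum((y_pred != c) & (y_true == c))
        p = tp / (tp + fp) if tp + fp else 0.0
        r = tp / (tp + fn) if tp + fn else 0.0
        f = 2 * p * r / (p + r) if p + r else 0.0
        precisions.append(p); recalls.append(r); f1s.append(f)
    return {
        "accuracy": float(np.mean(y_true == y_pred)),
        "precision": float(np.mean(precisions)),  # macro
        "recall": float(np.mean(recalls)),        # macro
        "f1": float(np.mean(f1s)),                # macro
    }

# 0 = negative, 1 = neutral, 2 = positive
print(compute_metrics([0, 0, 1, 1, 2, 2], [0, 1, 1, 1, 2, 0]))
```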

## Model Performance

The final model's performance on the test set will be reported in terms of accuracy, precision, recall, and F1-score to provide a comprehensive overview of its predictive capabilities.

## Model Status

This model is currently being evaluated in development.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- warmup_steps: 500
- num_train_epochs: 5

### Framework versions

- Transformers 4.42.3
- Pytorch 2.1.2
- Datasets 2.20.0
- Tokenizers 0.19.1