qwdf8591 committed on
Commit beaad41 · verified · 1 Parent(s): ab9e415

Update README.md

Files changed (1):
  1. README.md +63 -6

README.md CHANGED
@@ -3,6 +3,7 @@ license: mit
  base_model: roberta-base
  tags:
  - generated_from_trainer
+ - finance
  metrics:
  - accuracy
  - precision
@@ -11,6 +12,10 @@ metrics:
  model-index:
  - name: roberta-base_auditor_sentiment
    results: []
+ language:
+ - en
+ library_name: adapter-transformers
+ pipeline_tag: text-classification
  ---

  <!-- This model card has been generated automatically according to the information the Trainer had access to. You
@@ -18,7 +23,8 @@ should probably proofread and complete it, then remove this comment. -->

  # roberta-base_auditor_sentiment

- This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on an unknown dataset.
+ This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the [auditor_sentiment](https://huggingface.co/datasets/FinanceInc/auditor_sentiment?row=23) dataset.
+
  It achieves the following results on the evaluation set:
  - Loss: 0.5356
  - Accuracy: 0.8554
@@ -26,19 +32,70 @@ It achieves the following results on the evaluation set:
  - Recall: 0.8722
  - F1: 0.8414

+ ## Model demonstration
+
+ ```python
+ from transformers import pipeline
+
+ # Set repository_id to this model's repo id on the Hugging Face Hub,
+ # or to a local path containing the fine-tuned checkpoint.
+ repository_id = "<namespace>/roberta-base_auditor_sentiment"
+
+ # device=0 runs inference on the first GPU; omit it to run on CPU.
+ classifier = pipeline('text-classification', model=repository_id, device=0)
+
+ text = "Boomerang Boats had net sales of EUR 4.1 mn and it made an operating profit of EUR 0.4 mn in 2006 ."
+ result = classifier(text)
+ print(result)  # a list of {'label': ..., 'score': ...} dicts
+ ```
+
  ## Model description

- More information needed
+ This model, based on the RoBERTa architecture (roberta-base), is fine-tuned for a sentiment classification task specific to the finance sector. It is designed to
+ classify auditor reports into three sentiment categories: "negative", "neutral", and "positive". This capability can be crucial for financial analysis,
+ investment decision-making, and trend analysis in financial reports.

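+ The three class names above are exposed through the model's `id2label` mapping. The snippet below is a minimal sketch (the `repository_id` value is a hypothetical placeholder, as in the demonstration above) of how to confirm which index corresponds to which sentiment before interpreting raw pipeline labels.
+
+ ```python
+ from transformers import AutoConfig
+
+ # Hypothetical placeholder; use this model's actual Hub repo id or a local checkpoint path.
+ repository_id = "<namespace>/roberta-base_auditor_sentiment"
+
+ # The config stores the index-to-label mapping used by the classification head,
+ # e.g. {0: "negative", 1: "neutral", 2: "positive"} if labels were set during training.
+ config = AutoConfig.from_pretrained(repository_id)
+ print(config.id2label)
+ ```
+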
  ## Intended uses & limitations

- More information needed
+ ### Intended Uses
+
+ This model is intended for professionals and researchers working in the finance industry who require an automated tool to assess the sentiment conveyed in textual
+ data, specifically auditor reports. It can be integrated into financial analysis systems to provide quick insights into sentiment trends, which can aid in
+ decision-making processes.
+
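+ As one illustration of that kind of integration, the sketch below scores a batch of report sentences and tallies the predicted labels. It is a hypothetical workflow, not part of this model's training or evaluation code, and `repository_id` is again a placeholder.
+
+ ```python
+ from collections import Counter
+ from transformers import pipeline
+
+ # Hypothetical placeholder; use this model's actual Hub repo id or a local checkpoint path.
+ repository_id = "<namespace>/roberta-base_auditor_sentiment"
+ classifier = pipeline('text-classification', model=repository_id)
+
+ # Illustrative report sentences; any list of strings can be scored in one call.
+ sentences = [
+     "Operating profit rose to EUR 13.1 mn from EUR 8.7 mn in the corresponding period .",
+     "Net sales decreased to EUR 91.6 mn from EUR 109.7 mn .",
+ ]
+
+ predictions = classifier(sentences)                # one {'label': ..., 'score': ...} dict per sentence
+ trend = Counter(pred['label'] for pred in predictions)
+ print(trend)                                       # counts per predicted sentiment label
+ ```
+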
+ ### Limitations
+
+ - The model is trained specifically on a dataset from the finance domain and may not perform well on general text or texts from other domains.
+ - The sentiment is classified into only three categories, which might not fully capture more nuanced sentiments or domain-specific financial jargon.
+ - Like all AI models, this model should be used as an aid, not a substitute for professional financial analysis.

  ## Training and evaluation data

- More information needed
+ ### Training Data
+
+ The model was trained on the proprietary FinanceInc/auditor_sentiment dataset hosted on the Hugging Face Hub, which consists of labeled examples of auditor reports.
+ Each report is annotated with one of three sentiment labels: negative, neutral, or positive.
+
+ ### Evaluation Data
+
+ The evaluation was conducted on a split of the same dataset. The data was divided into training and validation sets with a sharding method to ensure a diverse
+ representation of samples in each set.
+
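+ The exact split code is not included in this card; the snippet below is a minimal sketch of a shard-based train/validation split using the `datasets` library. The split name and shard counts are illustrative assumptions, not the values actually used.
+
+ ```python
+ from datasets import load_dataset, concatenate_datasets
+
+ # Load the finance-domain sentiment dataset referenced by this card.
+ dataset = load_dataset("FinanceInc/auditor_sentiment", split="train")
+
+ # Hold out one of five shards for validation; the remaining shards form the training set.
+ # shard() partitions the dataset, so the two sets are disjoint.
+ eval_ds = dataset.shard(num_shards=5, index=0)
+ train_ds = concatenate_datasets([dataset.shard(num_shards=5, index=i) for i in range(1, 5)])
+
+ print(len(train_ds), len(eval_ds))
+ ```
+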
+ ## Training Procedure
+
+ The model was fine-tuned for 5 epochs with a batch size of 8 for both training and evaluation. An initial learning rate of 5e-5 was used with 500 warm-up steps
+ to stabilize the early stages of training. The best model was selected based on its performance on the validation set, and only the two best-performing
+ checkpoints were kept to conserve disk space.
+
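+ The training script itself is not part of this card; the following is a minimal `TrainingArguments` sketch matching the hyperparameters described above (the output directory and the epoch-level evaluation/save strategies are assumptions).
+
+ ```python
+ from transformers import TrainingArguments
+
+ training_args = TrainingArguments(
+     output_dir="roberta-base_auditor_sentiment",  # assumed output path
+     num_train_epochs=5,
+     per_device_train_batch_size=8,
+     per_device_eval_batch_size=8,
+     learning_rate=5e-5,
+     warmup_steps=500,
+     eval_strategy="epoch",            # evaluate after each epoch
+     save_strategy="epoch",
+     load_best_model_at_end=True,      # select the best checkpoint by validation performance
+     save_total_limit=2,               # keep only two checkpoints on disk
+ )
+ ```
+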
+ ## Evaluation Metrics
+
+ Evaluation metrics included accuracy, macro precision, macro recall, and macro F1-score, calculated after each epoch. These metrics helped monitor the model's
+ performance and ensure it generalized well beyond the training data.
+
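+ A minimal sketch of a `compute_metrics` function that produces these macro-averaged metrics for the Trainer (assuming scikit-learn, which this card does not explicitly name):
+
+ ```python
+ import numpy as np
+ from sklearn.metrics import accuracy_score, precision_recall_fscore_support
+
+ def compute_metrics(eval_pred):
+     # The Trainer passes (logits, label_ids) for each evaluation pass.
+     logits, labels = eval_pred
+     predictions = np.argmax(logits, axis=-1)
+     precision, recall, f1, _ = precision_recall_fscore_support(
+         labels, predictions, average="macro", zero_division=0
+     )
+     return {
+         "accuracy": accuracy_score(labels, predictions),
+         "precision": precision,
+         "recall": recall,
+         "f1": f1,
+     }
+ ```
+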
+ ## Model Performance
+
+ The final model's performance on the test set will be reported in terms of accuracy, precision, recall, and F1-score to provide a comprehensive overview
+ of its predictive capabilities.
+
+ ## Model Status
+
+ This model is currently being evaluated in development.

- ## Training procedure

  ### Training hyperparameters

@@ -68,4 +125,4 @@ The following hyperparameters were used during training:
  - Transformers 4.42.3
  - Pytorch 2.1.2
  - Datasets 2.20.0
- - Tokenizers 0.19.1
+ - Tokenizers 0.19.1