---
license: mit
base_model: roberta-base
tags:
- generated_from_trainer
- finance
metrics:
- accuracy
- precision
- recall
- f1
model-index:
- name: roberta-base_auditor_sentiment
  results: []
language:
- en
library_name: adapter-transformers
pipeline_tag: text-classification
---
# roberta-base_auditor_sentiment

This model is a fine-tuned version of [roberta-base](https://huggingface.co/roberta-base) on the [auditor_sentiment](https://huggingface.co/datasets/FinanceInc/auditor_sentiment) dataset.

It achieves the following results on the evaluation set:
- Loss: 0.5356
- Accuracy: 0.8554
- Precision: 0.8224
- Recall: 0.8722
- F1: 0.8414

## Model demonstration

```python
from transformers import pipeline

# Identifier of this model on the Hugging Face Hub; adjust to the actual repository id
repository_id = "roberta-base_auditor_sentiment"

# Load the fine-tuned classifier; device=0 runs on the first GPU (omit it to run on CPU)
classifier = pipeline("text-classification", model=repository_id, device=0)

text = "Boomerang Boats had net sales of EUR 4.1 mn and it made an operating profit of EUR 0.4 mn in 2006 ."
result = classifier(text)
# result is a list with one dict containing the predicted label and its confidence score
print(result)
```

## Model description

This model, based on the RoBERTa architecture (roberta-base), is fine-tuned for a sentiment classification task specific to the finance sector. It is designed to classify auditor reports into three sentiment categories: "negative", "neutral", and "positive". This capability can be crucial for financial analysis, investment decision-making, and trend analysis in financial reports.

## Intended uses & limitations

### Intended Uses

This model is intended for professionals and researchers working in the finance industry who require an automated tool to assess the sentiment conveyed in textual data, specifically auditor reports. It can be integrated into financial analysis systems to provide quick insights into sentiment trends, which can aid in decision-making processes.

### Limitations

- The model is specifically trained on a dataset from the finance domain and may not perform well on general text or texts from other domains.
- The sentiment is classified into only three categories, which might not fully capture more nuanced sentiments or specific financial jargon.
- Like all AI models, this model should be used as an aid, not a substitute for professional financial analysis.

## Training and evaluation data

### Training Data

The model was trained on the [FinanceInc/auditor_sentiment](https://huggingface.co/datasets/FinanceInc/auditor_sentiment) dataset, sourced from the Hugging Face Hub, which consists of labeled examples of auditor reports. Each report is annotated with one of three sentiment labels: negative, neutral, or positive.
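
For reference, the dataset can be loaded directly from the Hub. The snippet below is a minimal sketch; the split and column names are assumed to follow the dataset's defaults.

```python
from datasets import load_dataset

# FinanceInc/auditor_sentiment: auditor report sentences labeled negative / neutral / positive
dataset = load_dataset("FinanceInc/auditor_sentiment")

print(dataset)              # available splits and their sizes
print(dataset["train"][0])  # one labeled example (assumes a "train" split)
```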

### Evaluation Data

The evaluation was conducted using a split of the same dataset. The data was divided into training and validation sets with a sharding method to ensure a diverse representation of samples in each set.
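
The exact sharding scheme is not documented in this card; the snippet below is an illustrative sketch of one way such a shard-based split can be produced with the `datasets` library (the 90/10 ratio is an assumption).

```python
from datasets import load_dataset

# Load the full training data, then carve out a validation shard.
full = load_dataset("FinanceInc/auditor_sentiment", split="train")

# Shard 0 of 10 (every 10th example) becomes the validation set ...
eval_ds = full.shard(num_shards=10, index=0, contiguous=False)

# ... and the remaining examples are kept for training.
train_ds = full.filter(lambda _, idx: idx % 10 != 0, with_indices=True)
```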

## Training Procedure

The model was fine-tuned for 5 epochs with a batch size of 8 for both training and evaluation. An initial learning rate of 5e-5 was used with 500 warm-up steps to stabilize the early stages of training. The best model was selected based on its performance on the validation set, and only the two best-performing checkpoints were saved to conserve disk space.
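
A minimal sketch of how such a run can be configured with the `transformers` `Trainer` is shown below. It reuses the `train_ds`/`eval_ds` splits from the sketch above and the `compute_metrics` function defined in the next section; the text column name (`sentence`) and the output directory are assumptions, and the exact arguments used for this model may have differed.

```python
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("roberta-base")
model = AutoModelForSequenceClassification.from_pretrained("roberta-base", num_labels=3)

def tokenize(batch):
    # "sentence" is the assumed name of the text column in the dataset
    return tokenizer(batch["sentence"], truncation=True)

train_ds = train_ds.map(tokenize, batched=True)
eval_ds = eval_ds.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="roberta-base_auditor_sentiment",
    learning_rate=5e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=5,
    warmup_steps=500,
    lr_scheduler_type="linear",
    seed=42,
    eval_strategy="epoch",        # evaluate at the end of every epoch
    save_strategy="epoch",
    save_total_limit=2,           # keep at most two checkpoints to conserve disk space
    load_best_model_at_end=True,  # reload the best checkpoint on the validation set
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=eval_ds,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,  # defined in the Evaluation Metrics section below
)
trainer.train()
```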

## Evaluation Metrics

Evaluation metrics included accuracy, macro precision, macro recall, and macro F1-score, calculated after each epoch. These metrics helped monitor the model's performance and ensure it generalized well beyond the training data.
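
A `compute_metrics` function along the following lines reproduces these metrics. This is a sketch that assumes scikit-learn is available; the actual metric code used during training is not included in this card.

```python
import numpy as np
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def compute_metrics(eval_pred):
    # eval_pred is a (logits, labels) pair produced by the Trainer
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)
    precision, recall, f1, _ = precision_recall_fscore_support(
        labels, preds, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(labels, preds),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }
```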

## Model Performance

The final model's performance on the test set will be reported in terms of accuracy, precision, recall, and F1-score to provide a comprehensive overview of its predictive capabilities.

## Model Status

This model is currently being evaluated in development.

### Training hyperparameters

The following hyperparameters were used during training:
- learning_rate: 5e-05
- train_batch_size: 8
- eval_batch_size: 8
- seed: 42
- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
- lr_scheduler_type: linear
- lr_scheduler_warmup_steps: 500
- num_epochs: 5

### Training results

| Training Loss | Epoch | Step | Validation Loss | Accuracy | Precision | Recall | F1     |
|:-------------:|:-----:|:----:|:---------------:|:--------:|:---------:|:------:|:------:|
| 0.4217        | 1.0   | 485  | 0.6358          | 0.8223   | 0.8142    | 0.8135 | 0.8134 |
| 0.6538        | 2.0   | 970  | 0.6491          | 0.8388   | 0.8584    | 0.8025 | 0.8192 |
| 0.3961        | 3.0   | 1455 | 0.5356          | 0.8554   | 0.8224    | 0.8722 | 0.8414 |
| 0.1121        | 4.0   | 1940 | 0.7393          | 0.8512   | 0.8414    | 0.8477 | 0.8428 |
| 0.0192        | 5.0   | 2425 | 0.7233          | 0.8698   | 0.8581    | 0.8743 | 0.8657 |

The evaluation results reported at the top of this card correspond to the epoch 3 checkpoint, which achieved the lowest validation loss.

### Framework versions

- Transformers 4.42.3
- Pytorch 2.1.2
- Datasets 2.20.0
- Tokenizers 0.19.1