0x7o/rubert-tiny-sensitive-topics

	---
	license: mit
	base_model: cointegrated/rubert-tiny2
	tags:
	- generated_from_trainer
	metrics:
	- accuracy
	- f1
	- precision
	- recall
	model-index:
	- name: tiny-rubert
	  results: []
	---

	<!-- This model card has been generated automatically according to the information the Trainer had access to. You
	should probably proofread and complete it, then remove this comment. -->

	# tiny-rubert

	This model is a fine-tuned version of [cointegrated/rubert-tiny2](https://huggingface.co/cointegrated/rubert-tiny2) on an unknown dataset.
	It achieves the following results on the evaluation set (these figures match the step 12500 row of the training results table):
	- Loss: 2.5730
	- Accuracy: 0.4956
	- F1: 0.6380
	- Precision: 0.7116
	- Recall: 0.5873

	## Model description

	More information needed

	## Intended uses & limitations

	More information needed

	## Training and evaluation data

	More information needed

	## Training procedure

	### Training hyperparameters

	The following hyperparameters were used during training:
	- learning_rate: 5e-05
	- train_batch_size: 16
	- eval_batch_size: 16
	- seed: 42
	- optimizer: Adam with betas=(0.9,0.999) and epsilon=1e-08
	- lr_scheduler_type: linear
	- lr_scheduler_warmup_steps: 500
	- num_epochs: 10
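
The linear scheduler with 500 warmup steps listed above can be sketched in plain Python. This is a minimal illustration, not the code used for training. It assumes 1946 optimizer steps per epoch, a value inferred from the Epoch and Step columns of the training results table (for example, step 500 falls at epoch 0.2569), and it assumes the schedule spans the full 10 configured epochs. The names `linear_lr`, `STEPS_PER_EPOCH`, and `TOTAL_STEPS` are hypothetical.

```python
# Hyperparameters taken from the list above.
BASE_LR = 5e-5
WARMUP_STEPS = 500
NUM_EPOCHS = 10

# Assumption: 1946 optimizer steps per epoch, inferred from the Epoch/Step
# columns of the training results table (e.g. step 500 ~ epoch 0.2569).
STEPS_PER_EPOCH = 1946
TOTAL_STEPS = STEPS_PER_EPOCH * NUM_EPOCHS  # 19460


def linear_lr(step: int) -> float:
    """Learning rate at an optimizer step: linear warmup, then linear decay."""
    if step < WARMUP_STEPS:
        # Warmup: ramp linearly from 0 up to BASE_LR.
        return BASE_LR * step / WARMUP_STEPS
    # Decay: fall linearly from BASE_LR down to 0 at TOTAL_STEPS.
    remaining = max(0, TOTAL_STEPS - step)
    return BASE_LR * remaining / (TOTAL_STEPS - WARMUP_STEPS)


if __name__ == "__main__":
    for step in (0, 250, 500, 10000, TOTAL_STEPS):
        print(f"step {step:>5}: lr = {linear_lr(step):.2e}")
```

Under these assumptions the learning rate peaks at 5e-05 at step 500, the first evaluation point in the results table, and decays from there.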

	### Training results

	| Training Loss | Epoch  | Step  | Validation Loss | Accuracy | F1     | Precision | Recall |
	|:-------------:|:------:|:-----:|:---------------:|:--------:|:------:|:---------:|:------:|
	| No log        | 0.2569 | 500   | 4.0706          | 0.0551   | 0.0257 | 0.0397    | 0.0556 |
	| 4.3917        | 0.5139 | 1000  | 3.3738          | 0.2871   | 0.2256 | 0.4551    | 0.2810 |
	| 2.4111        | 0.7708 | 1500  | 3.0120          | 0.4041   | 0.4675 | 0.6818    | 0.4345 |
	| 1.6023        | 1.0277 | 2000  | 2.8194          | 0.4454   | 0.5570 | 0.7232    | 0.4930 |
	| 1.2666        | 1.2847 | 2500  | 2.7362          | 0.4553   | 0.5615 | 0.7195    | 0.5    |
	| 1.0944        | 1.5416 | 3000  | 2.6636          | 0.4513   | 0.5783 | 0.7227    | 0.5106 |
	| 1.0944        | 1.7986 | 3500  | 2.5940          | 0.4543   | 0.5842 | 0.7290    | 0.5134 |
	| 1.0351        | 2.0555 | 4000  | 2.5506          | 0.4690   | 0.5953 | 0.7435    | 0.5254 |
	| 0.9259        | 2.3124 | 4500  | 2.5396          | 0.4474   | 0.5780 | 0.7272    | 0.5127 |
	| 0.802         | 2.5694 | 5000  | 2.4499          | 0.4680   | 0.6044 | 0.7420    | 0.5310 |
	| 0.7777        | 2.8263 | 5500  | 2.4295          | 0.4661   | 0.5902 | 0.7239    | 0.5232 |
	| 0.7247        | 3.0832 | 6000  | 2.4434          | 0.4631   | 0.5880 | 0.7245    | 0.5197 |
	| 0.7247        | 3.3402 | 6500  | 2.4479          | 0.4769   | 0.6023 | 0.7401    | 0.5352 |
	| 0.6062        | 3.5971 | 7000  | 2.4713          | 0.4720   | 0.6076 | 0.7465    | 0.5359 |
	| 0.6207        | 3.8541 | 7500  | 2.4590          | 0.4779   | 0.6020 | 0.7284    | 0.5359 |
	| 0.6021        | 4.1110 | 8000  | 2.4468          | 0.4926   | 0.6333 | 0.7359    | 0.5676 |
	| 0.4891        | 4.3679 | 8500  | 2.4930          | 0.4848   | 0.6232 | 0.7313    | 0.5599 |
	| 0.4983        | 4.6249 | 9000  | 2.4374          | 0.4936   | 0.6249 | 0.7239    | 0.5676 |
	| 0.4983        | 4.8818 | 9500  | 2.4792          | 0.4956   | 0.6246 | 0.7208    | 0.5648 |
	| 0.4789        | 5.1387 | 10000 | 2.5257          | 0.4897   | 0.6355 | 0.7117    | 0.5845 |
	| 0.4353        | 5.3957 | 10500 | 2.5430          | 0.4946   | 0.6358 | 0.7276    | 0.5761 |
	| 0.3995        | 5.6526 | 11000 | 2.5579          | 0.4887   | 0.6340 | 0.7188    | 0.5782 |
	| 0.4005        | 5.9096 | 11500 | 2.5249          | 0.4828   | 0.6305 | 0.7014    | 0.5824 |
	| 0.3774        | 6.1665 | 12000 | 2.6100          | 0.4838   | 0.6295 | 0.7194    | 0.5725 |
	| 0.3774        | 6.4234 | 12500 | 2.5730          | 0.4956   | 0.6380 | 0.7116    | 0.5873 |
	| 0.3502        | 6.6804 | 13000 | 2.6117          | 0.4916   | 0.6358 | 0.7066    | 0.5880 |
	| 0.3562        | 6.9373 | 13500 | 2.6457          | 0.4956   | 0.6373 | 0.7185    | 0.5838 |
	| 0.3453        | 7.1942 | 14000 | 2.6547          | 0.4848   | 0.6316 | 0.7062    | 0.5810 |
	| 0.3213        | 7.4512 | 14500 | 2.6828          | 0.4877   | 0.6258 | 0.7035    | 0.5746 |
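
The table can be scanned programmatically to see which evaluation step scores best on a given metric. The sketch below copies the Step, Validation Loss, and F1 columns from the table and finds the row with the highest F1 and the row with the lowest validation loss. The variable names are hypothetical. Whether the published checkpoint was actually selected by F1 is not stated in this card, so treat that reading as an assumption.

```python
# (step, validation_loss, f1) copied from the training results table.
ROWS = [
    (500, 4.0706, 0.0257), (1000, 3.3738, 0.2256), (1500, 3.0120, 0.4675),
    (2000, 2.8194, 0.5570), (2500, 2.7362, 0.5615), (3000, 2.6636, 0.5783),
    (3500, 2.5940, 0.5842), (4000, 2.5506, 0.5953), (4500, 2.5396, 0.5780),
    (5000, 2.4499, 0.6044), (5500, 2.4295, 0.5902), (6000, 2.4434, 0.5880),
    (6500, 2.4479, 0.6023), (7000, 2.4713, 0.6076), (7500, 2.4590, 0.6020),
    (8000, 2.4468, 0.6333), (8500, 2.4930, 0.6232), (9000, 2.4374, 0.6249),
    (9500, 2.4792, 0.6246), (10000, 2.5257, 0.6355), (10500, 2.5430, 0.6358),
    (11000, 2.5579, 0.6340), (11500, 2.5249, 0.6305), (12000, 2.6100, 0.6295),
    (12500, 2.5730, 0.6380), (13000, 2.6117, 0.6358), (13500, 2.6457, 0.6373),
    (14000, 2.6547, 0.6316), (14500, 2.6828, 0.6258),
]

# Row with the highest F1 score.
best_f1 = max(ROWS, key=lambda row: row[2])
# Row with the lowest validation loss.
lowest_loss = min(ROWS, key=lambda row: row[1])

if __name__ == "__main__":
    print(f"highest F1:   step {best_f1[0]} (F1 {best_f1[2]:.4f})")
    print(f"lowest loss:  step {lowest_loss[0]} (loss {lowest_loss[1]:.4f})")
```

The highest F1 falls at step 12500, the same row as the headline metrics at the top of this card, while validation loss bottoms out much earlier at step 5500 and then drifts upward as training loss keeps falling.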


	### Framework versions

	- Transformers 4.40.1
	- PyTorch 2.2.1+cu121
	- Tokenizers 0.19.1