rebego
/

clasificador-tweets

Text Classification

Generated from Trainer


rebego committed on Dec 13, 2024

Commit c833d9e · verified · 1 Parent(s): 8dd0965

Update README.md

Files changed (1)

README.md +40 -12

README.md CHANGED

@@ -11,8 +11,6 @@ model-index:
   results: []
 ---
-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->
 # clasificador-tweets
@@ -23,25 +21,55 @@ It achieves the following results on the evaluation set:
 ## Model description
-This model has been trained to classify tweets into 7 categories related to the workplace:
-- **Precarious salary**
-- **Labor rights**
-- **Labor exploitation**
-- **Workplace harassment**
-- **Abuse of authority**
-- **Workplace negligence**
-- **Job opportunity**
 ## Intended uses & limitations
-More information needed
 ## Training and evaluation data
-More information needed
 ## Training procedure
 ### Training hyperparameters
 The following hyperparameters were used during training:

   results: []
 ---
 # clasificador-tweets
 ## Model description
+This model has been trained to classify tweets into 7 labor-related categories:
+- **Low salary**
+- **Labor rights**
+- **Labor exploitation**
+- **Workplace harassment**
+- **Abuse of authority**
+- **Workplace negligence**
+- **Job opportunities**
+The model was trained on the dataset "somosnlp-hackathon-2022/es_tweets_laboral", which contains Spanish tweets labeled with the 7 categories listed above.
+The dataset has the following characteristics:
+- **Training set**: 184 tweets.
+- **Test set**: 47 tweets.
+- **Columns**:
+  - `text`: the tweet's text.
+  - `intent`: the tweet's category.
+  - `entities`: additional information about the entities identified in the tweet.
+The tokenizer from "mrm8488/electricidad-base-discriminator" was used for tokenization.
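The data preparation described above could be sketched roughly as follows. This is a minimal illustration, not the author's training script: the dataset and tokenizer identifiers and the `text` column come from the model card, while the helper names and `max_length=128` are assumptions made for the example.

```python
# Identifiers quoted in the model card.
DATASET_ID = "somosnlp-hackathon-2022/es_tweets_laboral"
TOKENIZER_ID = "mrm8488/electricidad-base-discriminator"


def tokenize_batch(batch, tokenizer, max_length=128):
    """Tokenize the `text` column of a batch.

    `max_length=128` is an assumed value, not taken from the model card.
    """
    return tokenizer(batch["text"], truncation=True, max_length=max_length)


def load_tokenized_dataset():
    """Download the dataset and tokenizer, then tokenize every split.

    Hypothetical helper; requires the `datasets` and `transformers`
    packages and network access, so the imports are kept local.
    """
    from datasets import load_dataset
    from transformers import AutoTokenizer

    dataset = load_dataset(DATASET_ID)
    tokenizer = AutoTokenizer.from_pretrained(TOKENIZER_ID)
    tokenized = dataset.map(
        tokenize_batch, batched=True, fn_kwargs={"tokenizer": tokenizer}
    )
    return tokenized, tokenizer
```

Calling `load_tokenized_dataset()` should return the tokenized splits together with the tokenizer; the `intent` column would still need to be mapped to integer labels before training, and how that was done is not stated in the card.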
 ## Intended uses & limitations
+Classification of tweets related to labor topics.
+The model's accuracy on the test set is approximately 72%.
+It is designed to classify tweets in Spanish.
+The dataset is small (184 tweets for training), which may limit the model's generalization.
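For the intended use described above, inference might look like the sketch below. It assumes the model is published under the repository id shown on this page (`rebego/clasificador-tweets`) and can be loaded with the Transformers `pipeline` API; the helper names are hypothetical, and the label strings the model returns are not documented in the card.

```python
MODEL_ID = "rebego/clasificador-tweets"  # repository id shown on this page


def classify_tweets(texts):
    """Run the classifier on a list of Spanish tweets.

    Hypothetical helper; requires `transformers` and network access,
    so the import is kept local.
    """
    from transformers import pipeline

    classifier = pipeline("text-classification", model=MODEL_ID)
    return classifier(texts)


def pair_predictions(texts, predictions):
    """Pair each tweet with its predicted label and score.

    Expects one {"label": ..., "score": ...} dict per input text, which is
    the shape the text-classification pipeline returns for a list of inputs.
    """
    return [(t, p["label"], p["score"]) for t, p in zip(texts, predictions)]
```

A call such as `pair_predictions(tweets, classify_tweets(tweets))` would give one `(tweet, label, score)` tuple per input. Given the small training set, low-scoring predictions are probably best treated with caution.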
 ## Training and evaluation data
+The model was trained for **10 epochs** using accuracy as the evaluation metric. The results on the test set were as follows:
+- **Loss**: 0.937
+- **Accuracy**: 72.34%
+These results may vary between runs because model training is inherently random.
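To put the reported accuracy in context, the short calculation below relates it to the size of the test set. It uses only the two figures from the card (47 test tweets, 72.34% accuracy); the count of correct predictions is inferred from them rather than reported by the author.

```python
TEST_SET_SIZE = 47          # test tweets, from the model card
REPORTED_ACCURACY = 0.7234  # 72.34%, from the model card

# Number of correct predictions implied by the reported accuracy.
implied_correct = round(REPORTED_ACCURACY * TEST_SET_SIZE)

# Accuracy recomputed from that count, and the effect of a single tweet.
recomputed = implied_correct / TEST_SET_SIZE
one_tweet = 1 / TEST_SET_SIZE

print(f"Implied correct predictions: {implied_correct} of {TEST_SET_SIZE}")
print(f"Recomputed accuracy: {recomputed:.2%}")
print(f"Accuracy change per tweet: {one_tweet:.2%}")
```

With a test set this small, each tweet moves the accuracy by about two percentage points, so differences of a few points between runs are to be expected.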
 ## Training procedure
+Training used the Transformers library by Hugging Face.
 ### Training hyperparameters
 The following hyperparameters were used during training: