dejanseo committed (verified)
Commit dd2b9e9 · 1 parent: c3b2d22

Update README.md

Files changed (1)
  1. README.md +86 -37
README.md CHANGED
@@ -1,43 +1,92 @@
  ---
  license: other
  license_name: link-attribution
  license_link: https://dejanmarketing.com/link-attribution/
  ---
 
- # Model Card: dejanseo/ecommerce-taxonomy-classifier
-
- ## Model Description
- **dejanseo/ecommerce-taxonomy-classifier** is a multi-level text classification model designed to categorize ecommerce product descriptions (or similar text) into a hierarchical taxonomy. It uses a pretrained ALBERT (albert-base-v2) backbone and a custom classification head that leverages parent-level one-hot encodings for deeper levels in the taxonomy.
-
- ### Model Architecture
- - **Base Model**: [ALBERT (albert-base-v2)](https://huggingface.co/albert-base-v2)
- - **Classification Head**: A linear layer (or multi-layer head) on top of the ALBERT pooled output, concatenated with a parent-level one-hot vector representing the higher-level class.
-
- ### Intended Use
- - **Primary Application**: Categorizing product descriptions in online retail or marketplace scenarios.
- - **Potential Use Cases**:
-   - E-commerce product listing
-   - Product categorization for inventory management
-   - Enriching product feeds for better search/discovery
-
- ### How to Use
- 1. **Installation**: Install `transformers`, `torch`, etc.
- 2. **Pipeline Example**:
- ```python
- from transformers import AlbertTokenizer, AlbertForSequenceClassification
- import torch
-
- # Load tokenizer and model from the Hugging Face Hub
- tokenizer = AlbertTokenizer.from_pretrained("dejanseo/ecommerce-taxonomy-classifier")
- model = AlbertForSequenceClassification.from_pretrained("dejanseo/ecommerce-taxonomy-classifier")
- model.eval()
-
- text = "Experience the magic of music with the Clavinova CLP-800 series digital pianos."
- inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
-
- with torch.no_grad():
-     outputs = model(**inputs)
-     logits = outputs.logits
-     predicted_class = torch.argmax(logits, dim=-1).item()
-
- print("Predicted Class:", predicted_class)
- ```
  ---
+ language: en
  license: other
  license_name: link-attribution
  license_link: https://dejanmarketing.com/link-attribution/
+ model_name: Taxonomy Classifier
+ pipeline_tag: text-classification
  ---
 
+ # Taxonomy Classifier
+
+ This model is a hierarchical text classifier that categorizes text into a 7-level taxonomy. It uses a chain of models in which the prediction at each level informs the prediction at the next level, reducing the classification space at each step.
+
+ ## Model Details
+
+ - **Model Developers:** dejanseo
+ - **Model Type:** Hierarchical text classification
+ - **Base Model:** [`albert/albert-base-v2`](https://huggingface.co/albert/albert-base-v2)
+ - **Model Architecture:**
+   - **Level 1:** Standard sequence classification using `AlbertForSequenceClassification`.
+   - **Levels 2-7:** Custom architecture (`TaxonomyClassifier`) in which the ALBERT pooled output is concatenated with a one-hot encoding of the predicted ID from the previous level before being fed into a linear classification layer (see the sketch after this list).
+ - **Language(s):** English
+ - **Library:** [Transformers](https://huggingface.co/docs/transformers/index)
+ - **License:** [link-attribution](https://dejanmarketing.com/link-attribution/)
+
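+ A minimal sketch of the Levels 2-7 head described above. The class name `TaxonomyClassifier` comes from this card, but the constructor arguments, attribute names, and forward signature here are illustrative assumptions, not the repository's exact code:
+
+ ```python
+ import torch
+ import torch.nn as nn
+ from transformers import AlbertModel
+
+ class TaxonomyClassifier(nn.Module):
+     """Hypothetical head: ALBERT pooled output + one-hot parent ID -> linear layer."""
+
+     def __init__(self, num_parent_classes: int, num_classes: int):
+         super().__init__()
+         self.albert = AlbertModel.from_pretrained("albert/albert-base-v2")
+         hidden = self.albert.config.hidden_size  # 768 for albert-base-v2
+         self.num_parent_classes = num_parent_classes
+         self.classifier = nn.Linear(hidden + num_parent_classes, num_classes)
+
+     def forward(self, input_ids, attention_mask, parent_ids):
+         # Pooled representation of the input text
+         pooled = self.albert(input_ids=input_ids, attention_mask=attention_mask).pooler_output
+         # One-hot encode the previous level's predicted ID and concatenate
+         parent_onehot = nn.functional.one_hot(parent_ids, self.num_parent_classes).float()
+         return self.classifier(torch.cat([pooled, parent_onehot], dim=-1))
+ ```
+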
+ ## Uses
+
+ ### Direct Use
+
+ The model is intended for categorizing text into a predefined 7-level taxonomy.
+
+ ### Downstream Uses
+
+ Potential applications include:
+
+ - Automated content tagging
+ - Product categorization
+ - Information organization
+
+ ### Out-of-Scope Use
+
+ Performance on text outside the domain of the training data, or on taxonomies with a different structure, is not guaranteed.
+
+ ## Limitations
+
+ - Performance depends on the quality and coverage of the training data.
+ - Errors at earlier levels of the hierarchy propagate to subsequent levels.
+ - Performance on unseen categories is limited.
+ - The model may exhibit biases present in the training data.
+ - One-hot encoding of parent IDs produces high-dimensional input features at deeper levels, which can hurt training efficiency and performance (observed most clearly at Level 4).
+
+ ## Training Data
+
+ The model was trained on a dataset of 374,521 samples. Each row in the training data represents a full taxonomy path from the root level to a leaf node.
+
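+ As an illustration of that row structure (the field names and ID values below are hypothetical, not the actual dataset schema):
+
+ ```python
+ # One training row: input text plus the predicted-category ID at each of the 7 levels.
+ row = {
+     "text": "Clavinova CLP-800 series digital piano",            # hypothetical example
+     "level_ids": [5, 52, 521, 5213, 52131, 521310, 5213102],     # root -> leaf path
+ }
+ ```
+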
+ ## Training Procedure
+
+ - **Levels:** Seven separate models were trained, one for each level of the taxonomy.
+ - **Level 1 Training:** Trained as a standard sequence classification task.
+ - **Levels 2-7 Training:** Trained with the custom architecture incorporating the predicted parent ID.
+ - **Input Format:**
+   - **Level 1:** Input text only.
+   - **Levels 2-7:** Input text concatenated with a one-hot encoded vector of the predicted ID from the previous level.
+ - **Objective Function:** Cross-entropy loss (`CrossEntropyLoss`)
+ - **Optimizer:** AdamW
+ - **Learning Rate:** Initially 5e-5, reduced to 1e-5 for Level 4.
+ - **Training Hyperparameters:** (a sketch of this setup follows the list)
+   - **Epochs:** 10
+   - **Validation Split:** 0.1
+   - **Validation Frequency:** Every 1,000 steps
+   - **Batch Size:** 38
+   - **Max Sequence Length:** 512
+   - **Early Stopping Patience:** 3
+
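+ A minimal sketch of the optimizer, loss, and early-stopping setup implied by these hyperparameters. `model`, `train_loader`, `val_loader`, and `evaluate` are placeholders; the actual training script is not reproduced here:
+
+ ```python
+ import torch
+ from torch.optim import AdamW
+
+ # Hyperparameters reported in this card
+ LR, EPOCHS, PATIENCE, EVAL_EVERY = 5e-5, 10, 3, 1000  # LR was 1e-5 for Level 4
+
+ optimizer = AdamW(model.parameters(), lr=LR)  # one model per taxonomy level
+ loss_fn = torch.nn.CrossEntropyLoss()
+
+ best_val, bad_evals, stop = float("inf"), 0, False
+ for epoch in range(EPOCHS):
+     if stop:
+         break
+     for step, (inputs, labels) in enumerate(train_loader):  # batch size 38, max length 512
+         optimizer.zero_grad()
+         loss = loss_fn(model(**inputs), labels)  # model returns per-class logits
+         loss.backward()
+         optimizer.step()
+         if step % EVAL_EVERY == 0:
+             val_loss = evaluate(model, val_loader)  # placeholder validation helper
+             if val_loss < best_val:
+                 best_val, bad_evals = val_loss, 0
+             else:
+                 bad_evals += 1
+                 stop = bad_evals >= PATIENCE  # early stopping with patience of 3
+                 if stop:
+                     break
+ ```
+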
+ ## Evaluation
+
+ Validation loss was the primary evaluation metric during training. The following trends were observed:
+
+ - **Levels 1, 2, and 3:** Validation loss decreased relatively quickly during training.
+ - **Level 4:** Validation loss decreased more slowly, potentially due to the significant increase in the dimensionality of the parent-ID one-hot encoding.
+
+ Further evaluation on downstream tasks is recommended to assess the model's practical performance.
+
+ ## How to Use
+
+ Inference can be performed using the provided Streamlit application (a programmatic sketch of the same chained inference follows the steps below):
+
+ 1. **Input Text:** Enter the text you want to classify.
+ 2. **Select Checkpoints:** Choose the desired checkpoint for each level's model. Checkpoints are saved in the respective `level{n}` directories (e.g., `level1/model` or `level4/level4_step31000`).
+ 3. **Run Inference:** Click the "Run Inference" button.
+
+ The application will output the predicted ID and the corresponding text description for each level of the taxonomy, based on the provided `mapping.csv` file.
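+
+ For reference, a minimal sketch of the level-by-level inference chain the application performs. The checkpoint path, the `load_taxonomy_classifier` helper, and the forward signature of the custom models are assumptions, not the app's exact implementation:
+
+ ```python
+ import torch
+ from transformers import AlbertTokenizer, AlbertForSequenceClassification
+
+ tokenizer = AlbertTokenizer.from_pretrained("albert/albert-base-v2")
+ text = "Experience the magic of music with the Clavinova CLP-800 series digital pianos."
+ inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
+
+ # Level 1: plain sequence classification
+ level1 = AlbertForSequenceClassification.from_pretrained("level1/model")  # hypothetical path
+ with torch.no_grad():
+     parent_id = level1(**inputs).logits.argmax(dim=-1)
+
+ # Levels 2-7: each custom model consumes the previous level's predicted ID
+ predicted_ids = [parent_id.item()]
+ for n in range(2, 8):
+     model = load_taxonomy_classifier(f"level{n}")  # placeholder loader for the custom head
+     with torch.no_grad():
+         logits = model(inputs["input_ids"], inputs["attention_mask"], parent_id)
+     parent_id = logits.argmax(dim=-1)
+     predicted_ids.append(parent_id.item())
+
+ print(predicted_ids)  # one predicted category ID per taxonomy level
+ ```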