Abdulrahman Al-Ghamdi committed on
Commit e06c224 · verified · 1 Parent(s): 1c3de2e

Update README.md

Files changed (1)
  1. README.md +50 -0
README.md CHANGED
@@ -13,3 +13,53 @@ base_model:
  - aubmindlab/bert-base-arabertv02
  pipeline_tag: text-classification
  ---
+
+ # 🍽️ Arabic Restaurant Review Sentiment Analysis 🚀
+
+ ## 📌 Overview
+ This project fine-tunes a **transformer-based model** to analyze sentiment in **Arabic restaurant reviews**.
+ We utilized **Hugging Face’s model training pipeline** and deployed the final model as an **interactive Gradio web app**.
+
+ ## 📥 Data Collection
+ The dataset used for fine-tuning was sourced from **Hugging Face Datasets**, specifically:
+ [📂 Arabic Restaurant Reviews Dataset](https://huggingface.co/datasets/hadyelsahar/ar_res_reviews)
+ It contains **restaurant reviews in Arabic** labeled with sentiment polarity.
+
+ ## 🔄 Data Preparation
+ - **Cleaning & Normalization**:
+   - Removed non-Arabic text, special characters, and extra spaces.
+   - Normalized Arabic characters (e.g., `إ, أ, آ → ا`, `ة → ه`).
+   - Downsampled positive reviews to balance the dataset.
+ - **Tokenization**:
+   - Used **AraBERT tokenizer** for efficient text processing.
+ - **Train-Test Split**:
+   - **80% Training** | **20% Testing**.
+
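The preparation steps listed above could be sketched roughly as follows. This is a minimal illustration, not the project's actual preprocessing code: the function names, the `label` key, the value `1` for positive reviews, the random seed, and the exact Unicode range treated as "Arabic" are all assumptions made for the example. Only the character mappings and the 80/20 ratio come from the list above.

```python
import random
import re

# Character classes are written as escapes to avoid right-to-left rendering issues.
ALEF_VARIANTS = re.compile("[\u0625\u0623\u0622]")   # alef with hamza below/above, alef with madda
NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")       # assumed definition of "non-Arabic"
EXTRA_SPACES = re.compile(r"\s+")


def normalize(text):
    """Drop non-Arabic characters, unify letter variants, collapse whitespace."""
    text = NON_ARABIC.sub(" ", text)
    text = ALEF_VARIANTS.sub("\u0627", text)   # map alef variants to bare alef
    text = text.replace("\u0629", "\u0647")    # map teh marbuta to heh
    return EXTRA_SPACES.sub(" ", text).strip()


def downsample_positive(rows, positive_label=1, seed=42):
    """Randomly keep only as many positive rows as there are other rows."""
    positives = [r for r in rows if r["label"] == positive_label]
    others = [r for r in rows if r["label"] != positive_label]
    rng = random.Random(seed)
    kept = rng.sample(positives, min(len(positives), len(others)))
    balanced = kept + others
    rng.shuffle(balanced)
    return balanced


def train_test_split(rows, test_fraction=0.2):
    """Split already-shuffled rows into train and test parts."""
    n_test = round(len(rows) * test_fraction)
    cut = len(rows) - n_test
    return rows[:cut], rows[cut:]
```

In practice the split would typically be done with the `datasets` library itself; the plain-Python version here only makes the 80/20 arithmetic explicit.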
+ ## 🏋️ Fine-Tuning & Results
+ The model was fine-tuned using **Hugging Face Transformers** on a dataset of restaurant reviews.
+
+ ### **📊 Evaluation Metrics**
+ | Metric | Score |
+ |-------------|--------|
+ | **Eval Loss** | `0.5665` |
+ | **Accuracy** | `70.37%` |
+ | **Precision** | `70.36%` |
+ | **Recall** | `70.37%` |
+ | **F1-score** | `69.75%` |
+ | **Eval Runtime** | `11.5 sec` |
+
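The README does not say how precision, recall, and F1 were averaged across classes. The identical accuracy and recall values in the table are consistent with support-weighted averaging, since weighted recall always equals accuracy, so the sketch below assumes that scheme. The function name and the toy labels are hypothetical and are not taken from the project.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Accuracy plus support-weighted precision, recall, and F1."""
    total = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / total
    precision = recall = f1 = 0.0
    for label, count in support.items():
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / count
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = count / total  # each class contributes in proportion to its support
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Toy example with made-up labels (1 = positive, 0 = negative).
scores = weighted_metrics([1, 1, 1, 1, 0, 0], [1, 1, 1, 0, 0, 1])
```

A function like this could be adapted into a `compute_metrics` callback for the `Trainer`, which would receive logits and labels rather than two plain lists.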
+ ## ⚙️ Training Parameters
+ ```python
+ from transformers import TrainingArguments
+
+ training_args = TrainingArguments(
+     output_dir="./results",
+     evaluation_strategy="steps",  # evaluate every `eval_steps` steps
+     eval_steps=200,
+     per_device_train_batch_size=2,
+     per_device_eval_batch_size=2,
+     num_train_epochs=5,
+     weight_decay=0.01,
+     learning_rate=3e-5,
+     logging_steps=100,
+     fp16=True,  # mixed-precision training; needs a CUDA GPU
+     report_to="none",  # do not log to external experiment trackers
+ )