Abdu-GH/AraRest-Arabic-Restaurant-Reviews-Sentiment-Analysis

Text Classification

sentiment-analysis

Abdulrahman Al-Ghamdi committed on Jan 25

Commit e06c224 · verified · 1 Parent(s): 1c3de2e

Update README.md

Files changed (1)

README.md +50 -0

@@ -13,3 +13,53 @@ base_model:
 - aubmindlab/bert-base-arabertv02
 pipeline_tag: text-classification
 ---

+# 🍽️ Arabic Restaurant Review Sentiment Analysis 🚀
+## 📌 Overview
+This project fine-tunes a **transformer-based model** (base model: `aubmindlab/bert-base-arabertv02`) to classify sentiment in **Arabic restaurant reviews**.
+The model was trained with **Hugging Face’s model training pipeline** and deployed as an **interactive Gradio web app**.
+## 📥 Data Collection
+The dataset used for fine-tuning was sourced from **Hugging Face Datasets**, specifically:
+[📂 Arabic Restaurant Reviews Dataset](https://huggingface.co/datasets/hadyelsahar/ar_res_reviews)
+It contains **restaurant reviews in Arabic** labeled with sentiment polarity.
+## 🔄 Data Preparation
+- **Cleaning & Normalization**:
+  - Removed non-Arabic text, special characters, and extra spaces.
+  - Normalized Arabic characters (e.g., `إ, أ, آ → ا`, `ة → ه`).
+- **Class Balancing**:
+  - Downsampled positive reviews to balance the dataset.
+- **Tokenization**:
+  - Used the **AraBERT tokenizer** that ships with the base model (`aubmindlab/bert-base-arabertv02`).
+- **Train-Test Split**:
+  - **80% Training** | **20% Testing**.
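The preparation steps above can be sketched in plain Python. This is a minimal illustration, not the author's actual code: the helper names (`normalize`, `downsample`, `train_test_split`), the Unicode range used to decide what counts as Arabic text, and the fixed seed are all assumptions.

```python
import random
import re

# Hypothetical helpers; the README does not show the real cleaning code.
_ALEF_VARIANTS = re.compile("[إأآ]")
# Assumption: "Arabic text" means characters in U+0621..U+064A plus whitespace,
# so digits, Latin letters, punctuation and diacritics are all dropped.
_NON_ARABIC = re.compile(r"[^\u0621-\u064A\s]")
_EXTRA_SPACES = re.compile(r"\s+")


def normalize(text):
    """Apply the listed normalizations to a single review."""
    text = _ALEF_VARIANTS.sub("ا", text)  # إ, أ, آ -> ا
    text = text.replace("ة", "ه")         # ة -> ه
    text = _NON_ARABIC.sub(" ", text)     # remove non-Arabic characters
    return _EXTRA_SPACES.sub(" ", text).strip()


def downsample(rows, majority_label, seed=42):
    """Randomly drop majority-class rows until it matches the other rows in size.

    `rows` is a list of (text, label) pairs.
    """
    majority = [r for r in rows if r[1] == majority_label]
    minority = [r for r in rows if r[1] != majority_label]
    rng = random.Random(seed)
    kept = rng.sample(majority, min(len(majority), len(minority)))
    balanced = kept + minority
    rng.shuffle(balanced)
    return balanced


def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle and split rows into (train, test); 80/20 by default."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)
    cut = len(rows) - round(len(rows) * test_fraction)
    return rows[:cut], rows[cut:]
```

For example, `normalize("الأكل ممتاز!!")` yields `"الاكل ممتاز"`. In a real pipeline the cleaned and balanced rows would then be passed to the tokenizer.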
+## 🏋️ Fine-Tuning & Results
+The model was fine-tuned with **Hugging Face Transformers** on the prepared restaurant review dataset.
+### **📊 Evaluation Metrics**
+| Metric       | Score  |
+|-------------|--------|
+| **Eval Loss** | `0.5665` |
+| **Accuracy**  | `70.37%` |
+| **Precision** | `70.36%` |
+| **Recall**    | `70.37%` |
+| **F1-score**  | `69.75%` |
+| **Eval Runtime** | `11.5 sec` |
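The table reports identical Accuracy and Recall values. That is what support-weighted averaging produces, since weighted recall always equals accuracy. The README does not say which averaging was used, so the sketch below is an assumption about how such numbers could be computed, with a hypothetical `weighted_metrics` helper written in plain Python.

```python
from collections import Counter


def weighted_metrics(y_true, y_pred):
    """Return accuracy plus support-weighted precision, recall and F1."""
    total = len(y_true)
    support = Counter(y_true)  # number of true examples per class
    pairs = list(zip(y_true, y_pred))
    accuracy = sum(t == p for t, p in pairs) / total

    precision = recall = f1 = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == label and p == label for t, p in pairs)
        predicted = sum(p == label for p in y_pred)
        p_c = tp / predicted if predicted else 0.0
        r_c = tp / support[label] if support[label] else 0.0
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[label] / total  # class share of the eval set
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}
```

Because each class's recall is weighted by its share of the evaluation set, the weighted recall reduces to the fraction of correct predictions, which is the accuracy.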
+## ⚙️ Training Parameters
+```python
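+from transformers import TrainingArguments
+
+training_args = TrainingArguments(
+    output_dir="./results",          # checkpoints are written here
+    evaluation_strategy="steps",     # evaluate every `eval_steps` steps (named `eval_strategy` in newer transformers releases)
+    eval_steps=200,
+    per_device_train_batch_size=2,
+    per_device_eval_batch_size=2,
+    num_train_epochs=5,
+    weight_decay=0.01,
+    learning_rate=3e-5,
+    logging_steps=100,               # log training loss every 100 steps
+    fp16=True,                       # mixed-precision training; needs a supported GPU
+    report_to="none"                 # disable external experiment trackers
+)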