Abdu-GH/AraRest-Arabic-Restaurant-Reviews-Sentiment-Analysis

@@ -1,3 +1,6 @@
 ---
 license: apache-2.0
 datasets:
@@ -9,95 +12,78 @@ metrics:
 - precision
 - recall
 - f1
-base_model: aubmindlab/bert-base-arabertv02
 pipeline_tag: text-classification
 tags:
-- text-classification
-- sentiment-analysis
 - arabic
-- restaurant-reviews
-model-index:
-- name: ArabReview-Sentiment
-  results:
-  - task:
-      type: text-classification
-    dataset:
-      name: hadyelsahar/ar_res_reviews
-      type: sentiment-analysis
-    metrics:
-    - name: Accuracy
-      type: accuracy
-      value: 86.41
-    - name: Precision
-      type: precision
-      value: 87.01
-    - name: Recall
-      type: recall
-      value: 86.49
-    - name: F1 Score
-      type: f1
-      value: 86.75
-library_name: transformers
 ---
-# 🍽️ Arabic Restaurant Review Sentiment Analysis 🚀
-## 📌 Overview
-This project fine-tunes **AraBERT** to analyze sentiment in **Arabic restaurant reviews**.
-We leveraged **Hugging Face’s `transformers` library** for training and deployed the model as an **interactive pipeline**.
-## 📥 Dataset
-The dataset used for fine-tuning is from:
-[📂 Arabic Restaurant Reviews Dataset](https://huggingface.co/datasets/hadyelsahar/ar_res_reviews)
-It contains restaurant reviews labeled as **Positive** or **Negative**.
-## 🔄 Preprocessing
-- **Cleaning & Normalization**:
-  - Removed **non-Arabic** text, special characters, and extra spaces.
-  - **Normalized Arabic characters** (e.g., `إ, أ, آ → ا`, `ة → ه`).
 - **Tokenization**:
-  - Used **AraBERT tokenizer** for efficient processing.
-- **Data Balancing**:
-  - 2,418 **Positive** | 2,418 **Negative** (Balanced Dataset).
 - **Train-Test Split**:
   - **80% Training** | **20% Testing**.
-## 🏋️ Fine-Tuning Details
-We fine-tuned **`aubmindlab/bert-base-arabertv2`** using full fine-tuning (not LoRA).
-### **📊 Model Performance**
 | Metric       | Score  |
 |-------------|--------|
-| **Train Loss**| `0.470` |
-| **Eval Loss** | `0.373` |
-| **Accuracy**  | `86.41%` |
-| **Precision** | `87.01%` |
-| **Recall**    | `86.49%` |
-| **F1-score**  | `86.75%` |
----
-## ⚙️ Training Parameters
 ```python
-model_name = "aubmindlab/bert-base-arabertv2"
-model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2, classifier_dropout=0.5).to(device)
 training_args = TrainingArguments(
     output_dir="./results",
-    evaluation_strategy="epoch",
-    save_strategy="epoch",
-    per_device_train_batch_size=8,
-    per_device_eval_batch_size=8,
-    num_train_epochs=4,
-    weight_decay=1,
-    learning_rate=1e-5,
-    lr_scheduler_type="cosine",
-    warmup_ratio=0.1,
     fp16=True,
-    report_to="none",
     save_total_limit=2,
     gradient_accumulation_steps=2,
     load_best_model_at_end=True,
     max_grad_norm=1.0,
     metric_for_best_model="eval_loss",
     greater_is_better=False,
-)

+# Create a Markdown file with the enhanced model card content
+model_card_content = """\
 ---
 license: apache-2.0
 datasets:
 - precision
 - recall
 - f1
+base_model:
+- aubmindlab/bert-base-arabertv02
 pipeline_tag: text-classification
 tags:
 - arabic
+- sentiment-analysis
+- transformers
+- huggingface
+- bert
+- restaurants
+- fine-tuning
+- nlp
 ---
+# **🍽️ Arabic Restaurant Review Sentiment Analysis 🚀**
+## **📌 Overview**
+This **fine-tuned AraBERT model** classifies **Arabic restaurant reviews** as **Positive** or **Negative**.
+It is based on **aubmindlab/bert-base-arabertv2** and fine-tuned using **Hugging Face Transformers**.
+### **🔥 Why This Model?**
+✅ **Trained on Real Restaurant Reviews** from the `hadyelsahar/ar_res_reviews` dataset on the Hugging Face Hub.
+✅ **Fine-tuned with Full Training** (not LoRA or Adapters).
+✅ **Balanced Dataset** (2418 Positive vs. 2418 Negative Reviews).
+✅ **86.41% Accuracy and 86.75% F1-score** on the evaluation split for Arabic sentiment analysis.
+---
+## **📥 Dataset & Preprocessing**
+- **Dataset Source**: [`hadyelsahar/ar_res_reviews`](https://huggingface.co/datasets/hadyelsahar/ar_res_reviews)
+- **Text Cleaning**:
+  - Removed **non-Arabic text**, special characters, and extra spaces.
+  - Normalized Arabic characters (`إ, أ, آ → ا`, `ة → ه`).
+- **Data Balancing**:
+  - Balanced the classes at 2,418 **Positive** and 2,418 **Negative** reviews.
 - **Tokenization**:
+  - Used **AraBERT tokenizer** (`aubmindlab/bert-base-arabertv2`).
 - **Train-Test Split**:
   - **80% Training** | **20% Testing**.
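The cleaning and normalization steps listed above could look roughly like the following. This is a minimal sketch, not the author's actual preprocessing script: the function name `normalize_review` and the exact character ranges are assumptions made for illustration, and only the two mappings named in the card (`إ, أ, آ → ا` and `ة → ه`) are applied.

```python
import re

# Alef variants named in the card: إ (U+0625), أ (U+0623), آ (U+0622).
_ALEF_VARIANTS = re.compile("[\u0625\u0623\u0622]")
# Assumption: "Arabic text" means the U+0621..U+0652 range (letters and
# diacritics); everything else except whitespace is treated as non-Arabic.
_NON_ARABIC = re.compile(r"[^\u0621-\u0652\s]")
_EXTRA_SPACES = re.compile(r"\s+")


def normalize_review(text: str) -> str:
    """Hypothetical helper mirroring the cleaning steps described in the card."""
    text = _ALEF_VARIANTS.sub("\u0627", text)   # إ, أ, آ -> ا
    text = text.replace("\u0629", "\u0647")     # ة -> ه
    text = _NON_ARABIC.sub(" ", text)           # drop Latin, digits, punctuation
    return _EXTRA_SPACES.sub(" ", text).strip() # collapse extra spaces


if __name__ == "__main__":
    print(normalize_review("أكل  رائع!! very good 123"))
```

Non-Arabic characters are replaced with a space rather than deleted so that words separated only by punctuation do not get glued together; the final step then collapses the resulting runs of whitespace.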
+---
+## **🏋️ Training & Performance**
+The model was fine-tuned with **Hugging Face Transformers**. The final results and the training configuration are listed below.
+### **📊 Final Model Results**
 | Metric       | Score  |
 |-------------|--------|
+| **Train Loss** | `0.470` |
+| **Eval Loss**  | `0.373` |
+| **Accuracy**   | `86.41%` |
+| **Precision**  | `87.01%` |
+| **Recall**     | `86.49%` |
+| **F1-score**   | `86.75%` |
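As a quick sanity check on the table, the reported F1-score agrees with the harmonic mean of the reported precision and recall. The sketch below assumes the standard binary F1 definition; the card does not state which averaging mode was used, and the helper name `f1_from` is hypothetical.

```python
def f1_from(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    return 2 * precision * recall / (precision + recall)


# Values copied from the results table, in percent.
reported_precision = 87.01
reported_recall = 86.49

f1 = f1_from(reported_precision, reported_recall)
print(round(f1, 2))  # 86.75, matching the reported F1-score
```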
+### **⚙️ Training Configuration**
 ```python
 training_args = TrainingArguments(
     output_dir="./results",
+    evaluation_strategy="epoch",
+    save_strategy="epoch",
+    per_device_train_batch_size=8,
+    per_device_eval_batch_size=8,
+    num_train_epochs=4,
+    weight_decay=1,
+    learning_rate=1e-5,
+    lr_scheduler_type="cosine",
+    warmup_ratio=0.1,
     fp16=True,
     save_total_limit=2,
     gradient_accumulation_steps=2,
     load_best_model_at_end=True,
     max_grad_norm=1.0,
     metric_for_best_model="eval_loss",
     greater_is_better=False,
+)