AraBERT-Restaurant-Sentiment

This repository contains the fine-tuned model moazx/AraBERT-Restaurant-Sentiment, which classifies Arabic restaurant reviews as positive or negative. The model is based on the AraBERT architecture and was trained on a dataset of 800 Arabic restaurant reviews, collected and labeled using ChatGPT.

Model Description

The AraBERT-Restaurant-Sentiment model is fine-tuned to classify Arabic restaurant reviews into two categories:

  • Positive
  • Negative

The dataset used for training consists of 400 positive and 400 negative reviews, covering multiple Arabic dialects.

Sample Output

Below are some examples of the model's output:

Example 1

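Input: المطعم ما عجبني، الطعم مو حلو والخدمة كانت سيئة جداً، والموظفين ما كانوا محترمين. الأسعار غالية مقارنة بالجودة. ما بنصح فيه.

English translation: "I didn't like the restaurant; the food didn't taste good, the service was very bad, and the staff were not respectful. The prices are high for the quality. I don't recommend it."

Expected Classification: سلبي (negative)

Predicted Classification: سلبي (negative)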

Probability (Negative): 0.98

Probability (Positive): 0.02

Example 2

Input: المطعم يجنن والاكل تحفة

English translation: "The restaurant is amazing and the food is superb."

Expected Classification: إيجابي (positive)

Predicted Classification: إيجابي (positive)

Probability (Negative): 0.01

Probability (Positive): 0.99
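When many expected/predicted pairs like the two above are available, they can be summarized with standard classification metrics. The following is a minimal sketch of that comparison, not the evaluation code from the training notebook: the function name `binary_metrics`, the English label strings, and the four example labels are hypothetical and chosen only for illustration.

```python
def binary_metrics(expected, predicted, positive_label="positive"):
    """Return accuracy, precision, recall and F1 for the positive class."""
    if len(expected) != len(predicted):
        raise ValueError("expected and predicted must have the same length")
    if not expected:
        raise ValueError("at least one example is required")

    pairs = list(zip(expected, predicted))
    # Count true positives, false positives and false negatives.
    tp = sum(1 for e, p in pairs if e == positive_label and p == positive_label)
    fp = sum(1 for e, p in pairs if e != positive_label and p == positive_label)
    fn = sum(1 for e, p in pairs if e == positive_label and p != positive_label)
    correct = sum(1 for e, p in pairs if e == p)

    accuracy = correct / len(pairs)
    # Guard against division by zero when a class never occurs.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels for illustration only; these are NOT model results.
expected = ["negative", "positive", "positive", "negative"]
predicted = ["negative", "positive", "negative", "negative"]
metrics = binary_metrics(expected, predicted)
print(metrics)
```

On a balanced dataset such as the 400/400 split described above, accuracy is a reasonable headline number. Precision and recall show which class the errors fall on.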

Training and Dataset

The model was trained in a notebook available on Kaggle. The training process involved the following steps:

  1. Data Collection: 800 reviews (400 positive, 400 negative) were collected and labeled using ChatGPT.
  2. Preprocessing: Text normalization, tokenization, and other preprocessing steps were applied.
  3. Fine-Tuning: The AraBERT model was fine-tuned on the prepared dataset.
  4. Evaluation: The model was evaluated using accuracy and other metrics to assess its performance.
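The preprocessing step above does not say which normalization rules were applied. As an illustration of what Arabic text normalization commonly involves, here is a minimal sketch that strips diacritics and tatweel, unifies alef variants, and collapses whitespace. The function `normalize_arabic` and its rule set are assumptions, not the notebook's actual preprocessing.

```python
import re

# Arabic diacritics (tashkeel), U+064B..U+0652, plus superscript alef U+0670.
_DIACRITICS = re.compile(r"[\u064B-\u0652\u0670]")
# Tatweel (kashida), a decorative elongation character.
_TATWEEL = "\u0640"
# Alef with madda, hamza above, or hamza below, mapped to bare alef (U+0627).
_ALEF_VARIANTS = re.compile(r"[\u0622\u0623\u0625]")
_WHITESPACE = re.compile(r"\s+")


def normalize_arabic(text: str) -> str:
    """Apply a small, assumed set of normalization rules to Arabic text."""
    text = _DIACRITICS.sub("", text)
    text = text.replace(_TATWEEL, "")
    text = _ALEF_VARIANTS.sub("\u0627", text)
    # Collapse runs of whitespace and trim the ends.
    return _WHITESPACE.sub(" ", text).strip()


# Example: a review with extra spaces is reduced to single-spaced text.
print(normalize_arabic("  المطعم   يجنن  والاكل تحفة "))
```

These rules are lossy (for example, alef unification removes hamza information), so whatever normalization is used at inference time should match the one used during fine-tuning.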

Usage

Load moazx/AraBERT-Restaurant-Sentiment with the Hugging Face transformers library. The example below classifies a single restaurant review:

from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the tokenizer and model
model_name = "moazx/AraBERT-Restaurant-Sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
model.eval()  # inference mode: disables dropout

# Encode the input text ("The restaurant is amazing and the food is superb")
input_text = "المطعم يجنن والاكل تحفة"
inputs = tokenizer(input_text, return_tensors="pt", truncation=True)

# Get the model predictions without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
probabilities = torch.nn.functional.softmax(outputs.logits, dim=-1)

# Index 0 is the negative class and index 1 is the positive class
negative_prob = probabilities[0][0].item()
positive_prob = probabilities[0][1].item()
# Labels: "إيجابي" = positive, "سلبي" = negative
predicted_class = "إيجابي" if positive_prob > negative_prob else "سلبي"

# Print the results
print(f"Predicted classification: {predicted_class}")
print(f"Probability (negative): {negative_prob:.2f}")
print(f"Probability (positive): {positive_prob:.2f}")
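The softmax call in the snippet above turns the model's two raw scores (logits) into probabilities that sum to 1. The following dependency-free sketch spells out that computation for a single review. The logit values are made up for illustration, and the `[negative, positive]` order follows the indexing used in the usage example.

```python
import math


def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    # Subtract the maximum first so math.exp cannot overflow on large scores.
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one review, in the order [negative, positive].
logits = [-2.0, 2.5]
negative_prob, positive_prob = softmax(logits)
predicted_class = "positive" if positive_prob > negative_prob else "negative"

print(f"Predicted classification: {predicted_class}")
print(f"Probability (negative): {negative_prob:.2f}")
print(f"Probability (positive): {positive_prob:.2f}")
```

Because there are only two classes, comparing the two probabilities is equivalent to comparing the two logits directly. The softmax is only needed when the probabilities themselves are reported.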

Acknowledgements

The dataset used for training the model was collected and labeled using ChatGPT. Special thanks to the creators of AraBERT and the Hugging Face team for their continuous support and development of open-source NLP tools.

For more details on the training process and dataset, please refer to the Kaggle Notebook.

Model size: 135M params (Safetensors, tensor type F32)