Sentiment Analysis with Fine-tuned Multilingual BERT for Georgian 🇬🇪
Model Overview
This is a fine-tuned BERT model for Georgian sentiment analysis, based on bert-base-multilingual-cased. The model was trained using the Georgian Sentiment Analysis dataset.
- Base Model: bert-base-multilingual-cased
- Fine-tuned on: Arseniy-Sandalov/Georgian-Sentiment-Analysis
- Task: Sentiment classification (positive, negative, neutral)
- Tokenizer: BERT multilingual cased tokenizer
- License: Check dataset source
Usage Example
You can load and use this model with Hugging Face Transformers:
from transformers import AutoModelForSequenceClassification, AutoTokenizer
import torch

model_name = "Arseniy-Sandalov/GeorgianBert-Sent"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

def predict_sentiment(text):
    # Tokenize the input, truncating to BERT's 512-token limit
    inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True, max_length=512)
    # Run inference without tracking gradients
    with torch.no_grad():
        outputs = model(**inputs)
    # Pick the highest-scoring class and map its index to a label
    prediction = torch.argmax(outputs.logits, dim=1).item()
    return ["negative", "neutral", "positive"][prediction]

text = "แแฎแแแ แแแแ แ แแแ แแแ แแ แแแแ"
print(predict_sentiment(text))
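
The function above returns only the top label. When a confidence score is also useful, the logits can be turned into probabilities with a softmax. The following is a minimal, dependency-free sketch of that step. The helper name `logits_to_probs` is hypothetical, the label order is assumed to match the list used in `predict_sentiment`, and the logits shown are made up for illustration.

```python
import math

# Assumption: same label order as the list in predict_sentiment.
LABELS = ["negative", "neutral", "positive"]

def logits_to_probs(logits, labels=LABELS):
    """Map raw logits for one text to a {label: probability} dict via softmax."""
    if len(logits) != len(labels):
        raise ValueError("expected one logit per label")
    # Subtract the largest logit before exponentiating for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return {label: e / total for label, e in zip(labels, exps)}

# Illustrative logits, not real model output.
probs = logits_to_probs([-1.2, 0.3, 2.1])
best = max(probs, key=probs.get)
print(best, round(probs[best], 3))
```

With the real model, the input would be `outputs.logits[0].tolist()`; `torch.softmax(outputs.logits, dim=1)` gives the same probabilities as a tensor.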
Training Details
Dataset Preprocessing:
- Removed irrelevant columns (e.g., perturbation)
- Stratified split: 80% train, 10% validation, 10% test
Evaluation Metric:
- ROC AUC Score (computed on validation & test sets)
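
ROC AUC is defined for binary problems, so a three-class task needs some averaging scheme. The model card does not state which one was used. The sketch below assumes a one-vs-rest, unweighted (macro) average, and computes each binary AUC as the fraction of positive/negative pairs ranked correctly. The function names and toy data are hypothetical.

```python
def binary_auc(scores, is_positive):
    """AUC as the probability that a random positive outranks a random negative."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # A tie between a positive and a negative counts as half a win.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def macro_ovr_auc(probs, y_true, n_classes=3):
    """Unweighted mean of the one-vs-rest AUC over all classes."""
    aucs = []
    for c in range(n_classes):
        scores = [row[c] for row in probs]
        aucs.append(binary_auc(scores, [y == c for y in y_true]))
    return sum(aucs) / n_classes

# Toy predictions, purely illustrative (class ids: 0, 1, 2).
y_true = [0, 1, 2, 2, 0, 1]
probs = [
    [0.7, 0.2, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.2, 0.7],
    [0.3, 0.3, 0.4],
    [0.4, 0.5, 0.1],
    [0.1, 0.6, 0.3],
]
print(macro_ovr_auc(probs, y_true))
```

The pairwise comparison is quadratic in the number of examples, so it suits small checks only. For real evaluation, scikit-learn's `roc_auc_score` with `multi_class="ovr"` computes the same kind of score efficiently.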
Citation
If you use this model, please cite the original dataset:
@misc {Stefanovitch2023Sentiment,
author = {Stefanovitch, Nicolas and Piskorski, Jakub and Kharazi, Sopho},
title = {Sentiment analysis for Georgian},
year = {2023},
publisher = {European Commission, Joint Research Centre (JRC)},
howpublished = {\url{http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf}},
url = {http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf},
type = {dataset},
note = {PID: http://data.europa.eu/89h/9f04066a-8cc0-4669-99b4-f1f0627fdbbf}
}