Indonesian Text Sentiment Analysis
Overview
This project fine-tunes a transformer-based model to analyze sentiment for Indonesian text.
Data Collection
The dataset used for fine-tuning was sourced from IndoNLU Datasets, specifically:
SmSA (IndoNLU) Dataset
Data Preparation
- Tokenization:
  - Used the IndoBERT tokenizer for text processing.
- Train-Test Split:
  - The dataset is already split into train, validation, and test sets.
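To make the tokenization step more concrete, here is a minimal pure-Python sketch of what fixed-length padding and truncation do to a sequence of token ids. It is an illustration only, not the tokenizer's actual implementation: the helper name `pad_or_truncate`, the padding id `0`, and the sample ids are assumptions, and a real tokenizer also handles special tokens when it truncates.

```python
from typing import Dict, List

MAX_LENGTH = 128  # same value as max_length used elsewhere in this README
PAD_ID = 0        # assumption: padding token id; check tokenizer.pad_token_id


def pad_or_truncate(token_ids: List[int], max_length: int = MAX_LENGTH,
                    pad_id: int = PAD_ID) -> Dict[str, List[int]]:
    """Return fixed-length input_ids plus the matching attention_mask."""
    kept = token_ids[:max_length]      # truncation: drop ids beyond max_length
    n_pad = max_length - len(kept)     # number of padding positions needed
    return {
        "input_ids": kept + [pad_id] * n_pad,
        "attention_mask": [1] * len(kept) + [0] * n_pad,
    }


short_seq = pad_or_truncate([2, 150, 87, 3])        # hypothetical token ids
long_seq = pad_or_truncate(list(range(1, 201)))     # 200 ids, gets truncated
print(len(short_seq["input_ids"]), sum(short_seq["attention_mask"]))
print(len(long_seq["input_ids"]), sum(long_seq["attention_mask"]))
```

Every example ends up with the same length, and the attention mask marks which positions hold real tokens.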
Fine-Tuning & Results
The model was fine-tuned using TensorFlow and Hugging Face Transformers.
Evaluation Metrics
| Epoch | Train Loss | Train Accuracy | Eval Loss | Eval Accuracy | Training Time | Validation Time |
|---|---|---|---|---|---|---|
| 1 | 0.2471 | 88.15% | 0.2107 | 91.31% | 7:55 min | 10 sec |
| 2 | 0.1844 | 90.41% | 0.2107 | 92.39% | 7:50 min | 10 sec |
| 3 | 0.1502 | 91.66% | 0.2135 | 93.14% | 7:51 min | 9 sec |
| 4 | 0.1285 | 92.50% | 0.2192 | 93.69% | 7:50 min | 10 sec |
| 5 | 0.1101 | 93.13% | 0.2367 | 94.14% | 7:48 min | 9 sec |
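In the metrics above, eval loss rises from epoch 3 onward while eval accuracy keeps improving, so the "best" epoch depends on the criterion. The short sketch below copies the reported numbers and picks an epoch under each criterion. It is only a reading aid: this README does not say which checkpoint was published or how it was selected.

```python
# (epoch, train_loss, train_acc_%, eval_loss, eval_acc_%) copied from the metrics above
history = [
    (1, 0.2471, 88.15, 0.2107, 91.31),
    (2, 0.1844, 90.41, 0.2107, 92.39),
    (3, 0.1502, 91.66, 0.2135, 93.14),
    (4, 0.1285, 92.50, 0.2192, 93.69),
    (5, 0.1101, 93.13, 0.2367, 94.14),
]

# Epoch with the highest evaluation accuracy
best_by_acc = max(history, key=lambda row: row[4])
# Epoch with the lowest evaluation loss (ties resolve to the earliest epoch)
best_by_loss = min(history, key=lambda row: row[3])

print(f"Highest eval accuracy: epoch {best_by_acc[0]} ({best_by_acc[4]:.2f}%)")
print(f"Lowest eval loss: epoch {best_by_loss[0]} ({best_by_loss[3]:.4f})")
```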
Training Parameters
epochs = 5
learning_rate = 5e-5
seed_val = 42
max_length = 128
batch_size = 32
eval_batch_size = 32
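One possible way to keep these parameters together is a small configuration object, sketched below. The class name `TrainConfig`, the `steps_per_epoch` helper, and the training-set size of 11,000 are hypothetical and not taken from this README; only the field values come from the list above.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    """Training parameters as listed in this README."""
    epochs: int = 5
    learning_rate: float = 5e-5
    seed_val: int = 42
    max_length: int = 128
    batch_size: int = 32
    eval_batch_size: int = 32

    def steps_per_epoch(self, num_examples: int) -> int:
        """Batches per epoch, counting a final partial batch."""
        return math.ceil(num_examples / self.batch_size)


cfg = TrainConfig()
n_train = 11_000  # hypothetical training-set size, not stated in this README
print(cfg.steps_per_epoch(n_train), cfg.epochs * cfg.steps_per_epoch(n_train))
```

Freezing the dataclass keeps the values from being changed by accident during a run.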
How to use
import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification, AutoTokenizer

# Load the model and tokenizer
model_name = "feverlash/Indonesian-SentimentAnalysis-Model"  # Replace with the path to your saved model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

# Function that runs sentiment prediction on a single text
def predict(text):
    sentiment_mapping = {
        1: "positive",
        0: "negative",
        2: "neutral"
    }
    # Tokenize the text
    inputs = tokenizer(
        text,
        return_tensors="tf",
        truncation=True,
        padding="max_length",
        max_length=128
    )
    # Run the model
    outputs = model(inputs)
    logits = outputs.logits
    # Compute class probabilities
    probabilities = tf.nn.softmax(logits).numpy()
    # Determine the predicted label
    predicted_index = int(tf.argmax(probabilities, axis=1).numpy()[0])
    predicted_label = sentiment_mapping.get(predicted_index, "unknown")
    # Prediction confidence
    confidence = probabilities[0][predicted_index]
    print(f"Text: {text}")
    print(f"Predicted label: {predicted_label} (Confidence: {confidence:.2f})")

# Example usage
text = "aku sedang jalan-jalan di Yogyakarta"  # "I am strolling around Yogyakarta"
predict(text)
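The last steps of `predict()` (softmax, argmax, label lookup) can be tried without downloading the model. The sketch below reimplements them in plain Python under stated assumptions: the helper name `decode_logits` and the sample logits are hypothetical, and the index-to-label mapping is copied from `predict()` above.

```python
import math
from typing import List, Tuple

# Same index-to-label mapping as in predict() above
SENTIMENT_MAPPING = {0: "negative", 1: "positive", 2: "neutral"}


def decode_logits(logits: List[float]) -> Tuple[str, float]:
    """Apply a numerically stable softmax and return (label, confidence)."""
    shift = max(logits)                            # subtract max to avoid overflow
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    idx = max(range(len(probs)), key=probs.__getitem__)  # argmax
    return SENTIMENT_MAPPING.get(idx, "unknown"), probs[idx]


label, confidence = decode_logits([0.1, 2.0, -1.0])  # hypothetical logits
print(f"{label} ({confidence:.2f})")  # prints: positive (0.83)
```

The confidence is simply the softmax probability of the winning class, so it says how peaked the distribution is, not how likely the prediction is to be correct.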
Base model for feverlash/Indonesian-SentimentAnalysis-Model: indobenchmark/indobert-base-p1