Indonesian Text Sentiment Analysis 🚀

📌 Overview

This project fine-tunes a transformer-based model to classify the sentiment of Indonesian text as positive, negative, or neutral.

📥 Data Collection

The dataset used for fine-tuning comes from the IndoNLU datasets, specifically the SmSA (IndoNLU) dataset.

🔄 Data Preparation

  • Tokenization:
    • Used the IndoBERT tokenizer to convert text into model inputs.
  • Train-Test Split:
    • The dataset already comes split into train, validation, and test sets, so no manual split was needed.
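
As a rough illustration of what fixed-length tokenization produces, the sketch below pads or truncates a list of token IDs to a fixed length and builds the matching attention mask. It is a conceptual stand-in, not the IndoBERT tokenizer itself: the token IDs are made up, `pad_or_truncate` is a hypothetical helper, and `pad_id=0` is an assumption (the real value would come from the tokenizer's `pad_token_id`). A real tokenizer also adds special tokens, which this sketch ignores.

```python
from typing import List, Tuple


def pad_or_truncate(
    token_ids: List[int], max_length: int = 128, pad_id: int = 0
) -> Tuple[List[int], List[int]]:
    """Return (input_ids, attention_mask), both exactly max_length long."""
    kept = token_ids[:max_length]            # truncate sequences that are too long
    n_pad = max_length - len(kept)           # number of padding positions needed
    input_ids = kept + [pad_id] * n_pad      # pad on the right
    attention_mask = [1] * len(kept) + [0] * n_pad  # 1 = real token, 0 = padding
    return input_ids, attention_mask


# Made-up token IDs, for illustration only
short_ids, short_mask = pad_or_truncate([2, 1045, 3377, 3], max_length=8)
long_ids, long_mask = pad_or_truncate(list(range(1, 201)), max_length=128)

print(short_ids)   # [2, 1045, 3377, 3, 0, 0, 0, 0]
print(short_mask)  # [1, 1, 1, 1, 0, 0, 0, 0]
print(len(long_ids), sum(long_mask))  # 128 128
```

Padding every input to the same length keeps batch shapes constant, at the cost of some wasted computation on short texts.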

🏋️ Fine-Tuning & Results

The model was fine-tuned using Hugging Face Transformers with TensorFlow.

📊 Evaluation Metrics

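| Epoch | Train Loss | Train Accuracy | Eval Loss | Eval Accuracy | Training Time | Validation Time |
|-------|------------|----------------|-----------|---------------|---------------|-----------------|
| 1 | 0.2471 | 88.15% | 0.2107 | 91.31% | 7:55 min | 10 sec |
| 2 | 0.1844 | 90.41% | 0.2107 | 92.39% | 7:50 min | 10 sec |
| 3 | 0.1502 | 91.66% | 0.2135 | 93.14% | 7:51 min | 9 sec |
| 4 | 0.1285 | 92.50% | 0.2192 | 93.69% | 7:50 min | 10 sec |
| 5 | 0.1101 | 93.13% | 0.2367 | 94.14% | 7:48 min | 9 sec |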
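
One way to read these numbers is to ask which epoch would be chosen as the "best" checkpoint. The sketch below copies the reported values into a small list and compares two common selection criteria, highest eval accuracy and lowest eval loss. The variable names are illustrative, and the choice of criterion is an assumption; the source does not state which checkpoint was published.

```python
# (epoch, train_loss, train_acc_%, eval_loss, eval_acc_%), copied from the table above
metrics = [
    (1, 0.2471, 88.15, 0.2107, 91.31),
    (2, 0.1844, 90.41, 0.2107, 92.39),
    (3, 0.1502, 91.66, 0.2135, 93.14),
    (4, 0.1285, 92.50, 0.2192, 93.69),
    (5, 0.1101, 93.13, 0.2367, 94.14),
]

# Criterion 1: highest evaluation accuracy
best_by_accuracy = max(metrics, key=lambda row: row[4])

# Criterion 2: lowest evaluation loss (min() returns the first row on a tie)
best_by_loss = min(metrics, key=lambda row: row[3])

# Gap between eval loss and train loss per epoch
loss_gaps = {row[0]: round(row[3] - row[1], 4) for row in metrics}

print(f"Best by eval accuracy: epoch {best_by_accuracy[0]} ({best_by_accuracy[4]:.2f}%)")
print(f"Best by eval loss:     epoch {best_by_loss[0]} ({best_by_loss[3]:.4f})")
print(f"Eval loss minus train loss: {loss_gaps}")
```

The two criteria disagree here: eval accuracy peaks at epoch 5, while eval loss is lowest at epochs 1 and 2 and rises afterwards as the train loss keeps falling.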

⚙️ Training Parameters

epochs = 5

learning_rate = 5e-5

seed_val = 42

max_length = 128

batch_size = 32

eval_batch_size = 32
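
To make these settings concrete, the sketch below groups them in a small config object and derives how many optimizer steps a run would take. `TrainingConfig` and `steps_per_epoch` are hypothetical names, and `num_train_examples = 10_000` is a placeholder, not the actual size of the training split; substitute the real number of training examples.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainingConfig:
    """Hyperparameters as listed above."""
    epochs: int = 5
    learning_rate: float = 5e-5
    seed_val: int = 42
    max_length: int = 128
    batch_size: int = 32
    eval_batch_size: int = 32


def steps_per_epoch(num_examples: int, batch_size: int) -> int:
    """Number of batches per epoch; the last batch may be smaller."""
    return math.ceil(num_examples / batch_size)


config = TrainingConfig()

# ASSUMPTION: placeholder dataset size, for illustration only
num_train_examples = 10_000

per_epoch = steps_per_epoch(num_train_examples, config.batch_size)
total_steps = per_epoch * config.epochs

print(f"Input shape per batch: ({config.batch_size}, {config.max_length})")
print(f"Steps per epoch: {per_epoch}, total steps: {total_steps}")
```

The total step count is what a learning-rate schedule (for example, linear decay) would typically be built around, if one is used.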

🤖 How to use

import tensorflow as tf
from transformers import TFAutoModelForSequenceClassification, AutoTokenizer

# Load the model and tokenizer
model_name = "feverlash/Indonesian-SentimentAnalysis-Model"  # Replace with the path to your saved model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = TFAutoModelForSequenceClassification.from_pretrained(model_name)

# Predict the sentiment of a single text
def predict(text):
    # Map class indices to sentiment labels
    sentiment_mapping = {
        0: "negative",
        1: "positive",
        2: "neutral"
    }

    # Tokenize the text
    inputs = tokenizer(
        text,
        return_tensors="tf",
        truncation=True,
        padding="max_length",
        max_length=128
    )

    # Run the model
    outputs = model(inputs)
    logits = outputs.logits

    # Convert logits to probabilities
    probabilities = tf.nn.softmax(logits, axis=-1).numpy()

    # Pick the most likely label
    predicted_index = int(tf.argmax(probabilities, axis=1).numpy()[0])
    predicted_label = sentiment_mapping.get(predicted_index, "unknown")

    # Confidence of the prediction
    confidence = probabilities[0][predicted_index]

    print(f"Text: {text}")
    print(f"Predicted label: {predicted_label} (Confidence: {confidence:.2f})")

# Example usage
text = "aku sedang jalan-jalan di Yogyakarta"  # "I am out sightseeing in Yogyakarta"
predict(text)
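
The post-processing inside `predict` (softmax, argmax, label lookup) can be followed by hand on a single set of logits. The sketch below reproduces those steps with the standard library only, so it runs without downloading the model. The logits are made up for illustration, and `softmax` is a small helper written for this example rather than the TensorFlow function.

```python
import math
from typing import List

# Same index-to-label mapping as in predict()
sentiment_mapping = {0: "negative", 1: "positive", 2: "neutral"}


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax: subtract the max before exponentiating."""
    shift = max(logits)
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for one input text (not real model output)
logits = [-1.2, 0.3, 2.1]

probabilities = softmax(logits)
predicted_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
predicted_label = sentiment_mapping.get(predicted_index, "unknown")
confidence = probabilities[predicted_index]

print(f"Predicted label: {predicted_label} (Confidence: {confidence:.2f})")
```

The reported confidence is simply the softmax probability of the winning class, so it reflects how far the top logit is from the others rather than a calibrated probability of being correct.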