---
tags:
- text-classification
- multi-label
- go-emotions
- transformers
- huggingface
license: apache-2.0
library_name: transformers
language:
- en
metrics:
- accuracy
- f1
base_model:
- google-bert/bert-base-uncased
pipeline_tag: text-classification
---
# Fine-Tuned BERT on GoEmotions Dataset
## Model Overview
This model is a **fine-tuned version of BERT** (`bert-base-uncased`) on the **GoEmotions** dataset for **multi-label emotion classification**: for each input text it can predict any subset of 28 labels (27 emotions plus Neutral).
## Performance
| Metric | Score |
|----------------|-------|
| **Accuracy** | 46.57% |
| **F1 Score** | 56.41% |
| **Hamming Loss** | 3.39% |
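
The accuracy here is presumably subset accuracy (an example counts as correct only if all 28 labels match exactly), which is why it looks low next to the F1 score; for Hamming loss, lower is better. Below is a minimal sketch of how such metrics can be computed with scikit-learn, assuming sigmoid outputs thresholded at 0.5 and weighted F1 averaging (the exact averaging behind the reported score is not stated):

```python
import numpy as np
from sklearn.metrics import accuracy_score, f1_score, hamming_loss

# Toy multi-hot arrays standing in for real GoEmotions predictions;
# in practice both would have shape (n_samples, 28).
y_true = np.array([[1, 0, 1], [0, 1, 0]])
y_prob = np.array([[0.9, 0.2, 0.7], [0.1, 0.8, 0.4]])
y_pred = (y_prob >= 0.5).astype(int)  # threshold sigmoid outputs at 0.5

print("Accuracy:", accuracy_score(y_true, y_pred))          # subset (exact-match) accuracy
print("F1:", f1_score(y_true, y_pred, average="weighted"))  # averaging choice is an assumption
print("Hamming loss:", hamming_loss(y_true, y_pred))        # fraction of wrong label flags
```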
## Model Usage
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch
# Load model and tokenizer
model_name = "codewithdark/bert-Gomotions"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)
# GoEmotions label set: 27 emotions plus Neutral (28 labels total)
emotion_labels = [
"Admiration", "Amusement", "Anger", "Annoyance", "Approval", "Caring", "Confusion",
"Curiosity", "Desire", "Disappointment", "Disapproval", "Disgust", "Embarrassment",
"Excitement", "Fear", "Gratitude", "Grief", "Joy", "Love", "Nervousness", "Optimism",
"Pride", "Realization", "Relief", "Remorse", "Sadness", "Surprise", "Neutral"
]
# Example text
text = "I'm so happy today!"
inputs = tokenizer(text, return_tensors="pt")
# Predict
with torch.no_grad():
    outputs = model(**inputs)

probs = torch.sigmoid(outputs.logits).squeeze(0)  # convert logits to per-label probabilities
# Get top 5 predictions
top5_indices = torch.argsort(probs, descending=True)[:5] # Get indices of top 5 labels
top5_labels = [emotion_labels[i] for i in top5_indices]
top5_probs = [probs[i].item() for i in top5_indices]
# Print results
print("Top 5 Predicted Emotions:")
for label, prob in zip(top5_labels, top5_probs):
    print(f"{label}: {prob:.4f}")
# Example output:
# Top 5 Predicted Emotions:
# Joy: 0.9478
# Love: 0.7854
# Optimism: 0.6342
# Admiration: 0.5678
# Excitement: 0.5231
```
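
Since the task is multi-label, a fixed top-5 list always returns five emotions even when fewer (or more) actually apply. A common alternative, sketched below as a continuation of the snippet above, is to keep every label whose probability clears a threshold; the 0.3 cutoff is an illustrative choice, not a value tuned for this model.

```python
# Threshold-based decision rule (continuation of the snippet above).
# The 0.3 cutoff is illustrative, not tuned for this model.
threshold = 0.3
predicted = [
    (label, probs[i].item())
    for i, label in enumerate(emotion_labels)
    if probs[i].item() >= threshold
]
print("Emotions above threshold:", predicted)
```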
## Training Details
- **Model:** `bert-base-uncased`
- **Dataset:** [GoEmotions](https://huggingface.co/datasets/go_emotions)
- **Optimizer:** AdamW
- **Loss Function:** BCEWithLogitsLoss (Binary Cross-Entropy for multi-label classification)
- **Batch Size:** 16
- **Epochs:** 3
- **Evaluation Metrics:** Accuracy, F1 Score, Hamming Loss
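
The training script itself is not published with this card. The sketch below is one way to reproduce the setup listed above; the learning rate and the toy batch are assumptions, standing in for a real DataLoader over GoEmotions with batch size 16 for 3 epochs.

```python
# Minimal training sketch consistent with the settings above; dataset
# preparation and the learning rate are illustrative assumptions.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

NUM_LABELS = 28
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=NUM_LABELS,
    problem_type="multi_label_classification",  # makes HF compute BCEWithLogitsLoss internally
)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # assumed learning rate

# Toy batch standing in for GoEmotions examples.
texts = ["I'm so happy today!", "This is terrible."]
labels = torch.zeros(2, NUM_LABELS)  # float multi-hot targets for BCE
labels[0, 17] = 1.0  # joy
labels[1, 2] = 1.0   # anger

inputs = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
outputs = model(**inputs, labels=labels)  # loss is BCEWithLogitsLoss over the 28 labels
outputs.loss.backward()
optimizer.step()
optimizer.zero_grad()
```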
## How to Use with the Hugging Face Pipeline
```python
from transformers import pipeline
classifier = pipeline("text-classification", model="codewithdark/bert-Gomotions", top_k=None)
classifier("I'm so excited about the trip!")
```
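
With `top_k=None`, each input yields scores for all 28 labels sorted by score, and the pipeline applies a sigmoid rather than a softmax when the model config is set up for multi-label classification (assuming this model's config sets `problem_type` accordingly). A short sketch of filtering that output, again with an illustrative 0.3 cutoff:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="codewithdark/bert-Gomotions", top_k=None)

# List input guarantees one result list per input text.
results = classifier(["I'm so excited about the trip!"])[0]
strong = [r for r in results if r["score"] >= 0.3]  # illustrative cutoff, not tuned
print(strong)  # e.g. [{'label': ..., 'score': ...}, ...]
```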
## Citation
If you use this model, please cite:
```bibtex
@misc{codewithdark2025bertgoemotions,
author = {codewithdark},
title = {Fine-tuned BERT on GoEmotions},
year = {2025},
url = {https://huggingface.co/codewithdark/bert-Gomotions}
}
``` |