---
license: apache-2.0
datasets:
- GoEmotions
library_name: transformers
language:
- en
tags:
- text-classification
- emotion-detection
- mental-health
- fine-tuned
model-index:
- name: Mental-Health-Chatbot-using-RoBERTa
  results:
  - task:
      type: text-classification
    dataset:
      name: GoEmotions
      type: emotions
    metrics:
    - name: AI2 Reasoning Challenge (25-Shot)
      type: AI2 Reasoning Challenge (25-Shot)
      value: 64.59
    source:
      name: Open LLM Leaderboard
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard
---

# Mental Health Chatbot using RoBERTa (Fine-Tuned on GoEmotions)

## Model Description

This model is a fine-tuned version of **RoBERTa-base**, designed for multi-label emotion classification. It was trained on the **GoEmotions** dataset, a comprehensive collection of Reddit comments annotated with 28 emotion categories. The model is optimized for applications requiring nuanced emotion analysis, such as mental health chatbots, sentiment analysis, and customer interaction systems.

Key Features:

- Multi-label emotion classification covering 28 fine-grained categories.
- Real-time inference capabilities for interactive applications.
- High accuracy at detecting nuanced emotions like gratitude, joy, and sadness.

---

## Repository

The full project, including the chatbot implementation and fine-tuning code, can be found at:
[GitHub Repository](https://github.com/kashyaparun/Mental-Health-Chatbot-using-RoBERTa-fine-tuned-on-GoEmotion)

---

## Applications

- **Mental Health Chatbots**: Understand user emotions and provide empathetic responses for emotional well-being.
- **Sentiment Analysis**: Analyze social media posts, reviews, and comments to gauge public sentiment.
- **Customer Support Systems**: Enhance customer interactions by detecting emotional states.

---

## Training and Evaluation

### Training Configuration

- **Base Model**: RoBERTa-base
- **Dataset**: [GoEmotions](https://www.kaggle.com/datasets/google/goemotions)
- **Batch Size**: 32
- **Optimizer**: AdamW
- **Learning Rate Scheduler**: Cosine Annealing
- **Loss Function**: Binary Cross-Entropy for multi-label classification
- **Epochs**: 5
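The configuration above can be sketched as a standard PyTorch fine-tuning loop. This is a minimal illustration, not the project's actual training script: a plain linear head stands in for the RoBERTa encoder so the snippet stays self-contained, and the random tensors are placeholders for real tokenized batches.

```python
import torch
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR

num_labels = 28  # GoEmotions emotion categories
# Stand-in for the RoBERTa-base encoder + classification head (hidden size 768)
model = torch.nn.Linear(768, num_labels)

optimizer = AdamW(model.parameters(), lr=2e-5)
scheduler = CosineAnnealingLR(optimizer, T_max=5)  # anneal over the 5 epochs
loss_fn = torch.nn.BCEWithLogitsLoss()  # multi-label: independent sigmoid per class

# Dummy batch of 32: pooled embeddings and multi-hot label vectors
features = torch.randn(32, 768)
labels = torch.randint(0, 2, (32, num_labels)).float()

for epoch in range(5):
    optimizer.zero_grad()
    logits = model(features)
    loss = loss_fn(logits, labels)  # BCE scores each emotion independently
    loss.backward()
    optimizer.step()
    scheduler.step()
```

Binary cross-entropy (rather than softmax cross-entropy) is what makes this multi-label: each of the 28 emotions is predicted independently, so a comment can express several at once.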

### Evaluation Results

The model achieved the following performance metrics on the GoEmotions dataset:

| Metric         | Value |
|----------------|-------|
| Macro F1-Score | 0.74  |
| ROC-AUC        | 0.95  |

Additional Benchmark:

- **AI2 Reasoning Challenge (25-Shot)**: 64.59
  [Source: Open LLM Leaderboard](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
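For reference, macro F1 in the multi-label setting averages the per-class F1 scores, so rare emotions weigh as much as common ones. A small sketch with toy multi-hot arrays (scikit-learn is assumed here for illustration; the card does not state which library was used):

```python
import numpy as np
from sklearn.metrics import f1_score

# Toy multi-hot ground truth and predictions: 4 samples x 3 classes
y_true = np.array([[1, 0, 1], [0, 1, 0], [1, 1, 0], [0, 0, 1]])
y_pred = np.array([[1, 0, 1], [0, 1, 0], [1, 0, 0], [0, 0, 1]])

# Macro: compute F1 per class, then take the unweighted mean
macro_f1 = f1_score(y_true, y_pred, average="macro")
print(round(macro_f1, 2))  # → 0.89 (class F1s: 1.0, 0.67, 1.0)
```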
83 |
+
---
|

## Model Files

The repository includes:

- **Tokenizer Configuration**: `tokenizer.json`, `tokenizer_config.json`, and `vocab.json`.
- **Model Weights**: `model_weights.pth`.
- **Special Tokens Map**: `special_tokens_map.json`.

These files are essential for reproducing the model or deploying it in other systems.

---

## How to Use

To load the model and tokenizer:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and model
tokenizer = AutoTokenizer.from_pretrained("kashyaparun/Mental-Health-Chatbot-using-RoBERTa")
model = AutoModelForSequenceClassification.from_pretrained("kashyaparun/Mental-Health-Chatbot-using-RoBERTa")

# Perform inference
text = "I'm feeling so joyful today!"
inputs = tokenizer(text, return_tensors="pt", padding=True, truncation=True)
outputs = model(**inputs)

# Emotion logits
print(outputs.logits)
```