Commit 245334a (verified), committed by alireza-2003 · 1 Parent(s): ccc4ba2

Update README.md

Files changed (1)
  1. README.md +32 -9
README.md CHANGED
@@ -14,16 +14,39 @@ pipeline_tag: text-classification

  This project fine-tunes a BERT model to classify Persian comments into two categories: complaints about product discrepancy (`True`) and all other comments (`False`). The model is trained on the [Basalam Comments](https://www.kaggle.com/datasets/alirezaazizkhani/labeled-persian-comments) dataset.

- ## Key Metrics

- - **Accuracy**: 95.89%
- - **F1 Score**: 95.62%


- ## Code
- The code for fine-tuning the model is available on [Kaggle](https://www.kaggle.com/code/alirezaazizkhani/finetune-bert-for-discrepancy/).

- ## Installation
- To run the code, install the necessary dependencies:
- ```bash
- pip install transformers datasets scikit-learn matplotlib seaborn
+ ## 🛠 Training Details
+ - **Base Model**: `HooshvareLab/bert-fa-base-uncased`
+ - **Fine-Tuning Dataset**: Basalam comments
+ - **[Notebook](https://www.kaggle.com/code/alirezaazizkhani/finetune-bert-for-discrepancy)**
+ - **Evaluation Metrics**:
+   - **Accuracy**: 95.89%
+   - **F1 Score**: 95.62%

+ ## 📥 How to Use
+ You can load and use the fine-tuned model as follows:

+ ```python
+ from transformers import AutoModelForSequenceClassification, AutoTokenizer
+ import torch

+ def classify_comment(text):
+     # Load the fine-tuned model and its tokenizer from the Hugging Face Hub
+     model_name = "alireza-2003/bert-fa-discrepancy-detection"
+     model = AutoModelForSequenceClassification.from_pretrained(model_name)
+     tokenizer = AutoTokenizer.from_pretrained(model_name)
+
+     inputs = tokenizer(text, return_tensors="pt", truncation=True, padding=True)
+     with torch.no_grad():
+         outputs = model(**inputs)
+     # Index of the highest logit: 1 = discrepancy complaint, 0 = anything else
+     prediction = torch.argmax(outputs.logits, dim=-1).item()
+
+     return "Discrepancy Complaint" if prediction == 1 else "Not a Complaint"
+
+ # English gloss: "I ordered two, one blue and one red, but both of them were red"
+ comment = "دو تا سفارش داده بودم یدونه ابی و یدونه قرمز ولی هردوتاش قرمز بود"
+ print(classify_comment(comment))
+ ```
+
+ ---
+ 📝 **Author**: [Alireza]
+ 📅 **Last Updated**: [2/16/2025]
+ 🔗 **Dataset**: [Kaggle Dataset](https://www.kaggle.com/datasets/alirezaazizkhani/labeled-persian-comments)
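
The README reports accuracy and F1 but does not show how they are computed. The sketch below is one plausible reading, assuming a binary F1 with `True` (discrepancy complaint) as the positive class; the linked notebook may use a different averaging mode. The function name `accuracy_and_f1` and the sample labels are hypothetical and are not taken from the Basalam dataset or the author's code.

```python
def accuracy_and_f1(y_true, y_pred):
    """Return (accuracy, F1) for boolean labels, with True as the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)        # complaints correctly flagged
    fp = sum(1 for t, p in pairs if not t and p)    # non-complaints flagged by mistake
    fn = sum(1 for t, p in pairs if t and not p)    # complaints that were missed
    correct = sum(1 for t, p in pairs if t == p)

    accuracy = correct / len(pairs)
    denom = 2 * tp + fp + fn
    # F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 here when there are no positives at all
    f1 = 2 * tp / denom if denom else 0.0
    return accuracy, f1


# Made-up labels for illustration only (not taken from the Basalam dataset).
y_true = [True, True, False, False, True, False, False, False]
y_pred = [True, False, False, False, True, True, False, False]
accuracy, f1 = accuracy_and_f1(y_true, y_pred)
print(f"Accuracy: {accuracy:.2%}, F1: {f1:.2%}")
```

On an imbalanced comment set the two numbers can diverge noticeably, which is a common reason to report both.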