dougtrajano committed on
Commit 329970e · 1 Parent(s): 34e989e

update model card README.md

Files changed (1)
  1. README.md +27 -55
README.md CHANGED
@@ -10,73 +10,38 @@ metrics:
  model-index:
  - name: toxicity-target-type-identification
  results: []
- datasets:
- - dougtrajano/olid-br
- language:
- - pt
- library_name: transformers
  ---
 
- # toxicity-target-type-identification
-
- Toxicity Target Type Identification is a model that classifies the type of target (individual, group, or other) of a given toxic text.
-
- This BERT model is a fine-tuned version of [neuralmind/bert-base-portuguese-cased](https://huggingface.co/neuralmind/bert-base-portuguese-cased) on the [OLID-BR dataset](https://huggingface.co/datasets/dougtrajano/olid-br).
-
- ## Overview
-
- **Input:** Text in Brazilian Portuguese
-
- **Output:** Multiclass classification (individual, group, or other)
-
- ## Usage
-
- ```python
- from transformers import AutoTokenizer, AutoModelForSequenceClassification
-
- tokenizer = AutoTokenizer.from_pretrained("dougtrajano/toxicity-target-type-identification")
-
- model = AutoModelForSequenceClassification.from_pretrained("dougtrajano/toxicity-target-type-identification")
- ```
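The snippet in the removed usage section above only loads the tokenizer and model; the remaining step from text to a predicted class can be sketched as below. The `ID2LABEL` mapping here is an assumption for illustration — the authoritative mapping lives in `model.config.id2label`.

```python
# Sketch of the post-processing step, assuming the tokenizer and model
# from the usage snippet above are already loaded.
# NOTE: this id2label mapping is an assumption; check model.config.id2label.
ID2LABEL = {0: "INDIVIDUAL", 1: "GROUP", 2: "OTHER"}

def predict_label(logits, id2label=ID2LABEL):
    """Return the label whose logit is highest (argmax over the class axis)."""
    best_index = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best_index]

# With the real model you would obtain the logits roughly like this:
#   inputs = tokenizer("algum texto", return_tensors="pt", truncation=True)
#   logits = model(**inputs).logits[0].tolist()
# Dummy logits are used here so the sketch runs stand-alone:
print(predict_label([2.3, -0.1, 0.4]))  # → INDIVIDUAL (index 0 has the highest logit)
```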
-
- ## Limitations and bias
-
- The following factors may degrade the model's performance.

- **Text Language**: The model was trained on Brazilian Portuguese texts, so it may not work well with other Portuguese dialects.
-
- **Text Origin**: The model was trained on texts from social media and a few texts from other sources, so it may not work well on other types of text.
-
- ## Trade-offs
-
- Sometimes models exhibit performance issues under particular circumstances. This section describes situations in which the model may perform less than optimally, so that you can plan accordingly.
-
- **Text Length**: The model was fine-tuned on texts with a word count between 1 and 178 words (average of 18 words). It may give poor results on texts with a word count outside this range.

- ## Performance
-
- The model was evaluated on the test set of the [OLID-BR](https://dougtrajano.github.io/olid-br/) dataset.
-
- **Accuracy:** 0.7505
-
- **Precision:** 0.7812
-
- **Recall:** 0.7505
-
- **F1-Score:** 0.7603
-
- | Class | Precision | Recall | F1-Score | Support |
- | :---: | :-------: | :----: | :------: | :-----: |
- | `INDIVIDUAL` | 0.8850 | 0.7964 | 0.8384 | 609 |
- | `GROUP` | 0.6766 | 0.6385 | 0.6570 | 213 |
- | `OTHER` | 0.4518 | 0.7177 | 0.5545 | 124 |
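The aggregate scores in the removed performance section are support-weighted averages of the per-class scores, which can be checked directly from the table:

```python
# Support-weighted averages recomputed from the per-class table above.
# Rows: (precision, recall, f1, support) for INDIVIDUAL, GROUP, OTHER.
rows = [
    (0.8850, 0.7964, 0.8384, 609),
    (0.6766, 0.6385, 0.6570, 213),
    (0.4518, 0.7177, 0.5545, 124),
]
total_support = sum(support for *_, support in rows)  # 946 test examples

def weighted(column):
    """Weighted average of one metric column over all classes."""
    return sum(row[column] * row[3] for row in rows) / total_support

print(round(weighted(2), 4))  # F1: 0.7603, matching the reported F1-Score
print(round(weighted(1), 4))  # Recall: 0.7505, matching the reported Recall
```

Precision recomputed this way comes out as 0.7813 rather than the reported 0.7812, presumably because the displayed per-class values are rounded to four digits.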

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
-
  - learning_rate: 3.952388499692274e-05
  - train_batch_size: 8
  - eval_batch_size: 8
@@ -85,13 +50,20 @@ The following hyperparameters were used during training:
  - lr_scheduler_type: linear
  - num_epochs: 30
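As a rough illustration, the listed hyperparameters map onto `transformers` `TrainingArguments` roughly as follows. This is a sketch, not the authors' actual training script: the `output_dir` is hypothetical, and the seed/optimizer settings elided from this hunk are omitted.

```python
# Sketch: the card's hyperparameters expressed as TrainingArguments.
# output_dir is a hypothetical path; settings not shown in the card are omitted.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="toxicity-target-type-identification",
    learning_rate=3.952388499692274e-05,
    per_device_train_batch_size=8,   # "train_batch_size: 8" in the card
    per_device_eval_batch_size=8,    # "eval_batch_size: 8" in the card
    lr_scheduler_type="linear",
    num_train_epochs=30,
)
```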

  ### Framework versions

  - Transformers 4.26.1
  - Pytorch 1.10.2+cu113
  - Datasets 2.9.0
  - Tokenizers 0.13.2
-
- ## Provide Feedback
-
- If you have any feedback on this model, please [open an issue](https://github.com/DougTrajano/ToChiquinho/issues/new) on GitHub.
 
  model-index:
  - name: toxicity-target-type-identification
  results: []
  ---

+ <!-- This model card has been generated automatically according to the information the Trainer had access to. You
+ should probably proofread and complete it, then remove this comment. -->

+ # toxicity-target-type-identification

+ This model is a fine-tuned version of [neuralmind/bert-large-portuguese-cased](https://huggingface.co/neuralmind/bert-large-portuguese-cased) on the [OLID-BR](https://huggingface.co/datasets/dougtrajano/olid-br) dataset.
+ It achieves the following results on the evaluation set:
+ - Loss: 1.4281
+ - Accuracy: 0.8002
+ - F1: 0.7986
+ - Precision: 0.7990
+ - Recall: 0.8002

+ ## Model description

+ More information needed

+ ## Intended uses & limitations

+ More information needed

+ ## Training and evaluation data

+ More information needed

  ## Training procedure

  ### Training hyperparameters

  The following hyperparameters were used during training:
  - learning_rate: 3.952388499692274e-05
  - train_batch_size: 8
  - eval_batch_size: 8
  - lr_scheduler_type: linear
  - num_epochs: 30

+ ### Training results
+
+ | Training Loss | Epoch | Step | Validation Loss | Accuracy | F1 | Precision | Recall |
+ |:-------------:|:-----:|:----:|:---------------:|:--------:|:------:|:---------:|:------:|
+ | No log | 1.0 | 355 | 0.7145 | 0.6903 | 0.7052 | 0.7528 | 0.6903 |
+ | 0.8011 | 2.0 | 710 | 0.9930 | 0.7928 | 0.7840 | 0.7835 | 0.7928 |
+ | 0.529 | 3.0 | 1065 | 1.4281 | 0.8002 | 0.7986 | 0.7990 | 0.8002 |
+ | 0.529 | 4.0 | 1420 | 1.6783 | 0.7727 | 0.7753 | 0.7788 | 0.7727 |
+ | 0.2706 | 5.0 | 1775 | 2.3904 | 0.7727 | 0.7683 | 0.7660 | 0.7727 |
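In the results table above, validation loss rises after epoch 1 while F1 peaks at epoch 3 and then declines — a typical overfitting pattern — and although 30 epochs were configured, the table stops at epoch 5, which suggests early stopping. The epoch-3 row matches the evaluation results reported at the top of the new card. Picking that best checkpoint from the table can be sketched as follows; using F1 as the selection metric is an assumption (`Trainer` uses whatever `metric_for_best_model` names):

```python
# Epoch-level validation metrics copied from the training-results table above:
# (epoch, validation_loss, f1)
results = [
    (1, 0.7145, 0.7052),
    (2, 0.9930, 0.7840),
    (3, 1.4281, 0.7986),
    (4, 1.6783, 0.7753),
    (5, 2.3904, 0.7683),
]

# Select the epoch with the best F1 (assumed selection metric):
best_epoch, best_loss, best_f1 = max(results, key=lambda row: row[2])
print(best_epoch, best_f1)  # → 3 0.7986, the scores reported at the top of the card
```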
  ### Framework versions

  - Transformers 4.26.1
  - Pytorch 1.10.2+cu113
  - Datasets 2.9.0
  - Tokenizers 0.13.2