Update README.md
model-index:
- name: bert-mapa-german
  results: []
language:
- de
---

# bert-mapa-german

This model is a fine-tuned version of [google-bert/bert-base-german-cased](https://huggingface.co/google-bert/bert-base-german-cased) on the MAPA German dataset.
Its purpose is to discern private information in German texts.

It achieves the following results on the test set:

| Category     | Precision | Recall | F1     | Number |
|--------------|-----------|--------|--------|--------|
| Address      | 0.5882    | 0.6667 | 0.625  | 15     |
| Age          | 0.0       | 0.0    | 0.0    | 3      |
| Amount       | 1.0       | 1.0    | 1.0    | 1      |
| Date         | 0.9455    | 0.9455 | 0.9455 | 55     |
| Name         | 0.7       | 0.9545 | 0.8077 | 22     |
| Organisation | 0.5405    | 0.6452 | 0.5882 | 31     |
| Person       | 0.5385    | 0.5    | 0.5185 | 14     |
| Role         | 0.0       | 0.0    | 0.0    | 1      |
| Overall      | 0.7255    | 0.7817 | 0.7525 |        |

- Loss: 0.0325
- Overall Accuracy: 0.9912
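Each per-category F1 score above is the harmonic mean of that category's precision and recall. A quick pure-Python check, using the Address row's scores as an example:

```python
def f1(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall; defined as 0 when both are 0."""
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Address row: precision ~0.5882, recall ~0.6667 -> F1 = 0.625
address_f1 = f1(0.5882352941176471, 0.6666666666666666)
print(round(address_f1, 4))  # 0.625
```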

## Intended uses & limitations

This model is intended for detecting private information in German texts. Because its training corpus comprises only 1744 example sentences, expect a comparatively high error rate in its predictions.

## Training and evaluation data

Random split of the MAPA German dataset into 80% train, 10% validation, and 10% test.
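The card does not state the splitting code or seed, so the sketch below is illustrative only (the helper name and seed are assumptions). It uses the fact that 1744 training sentences at an 80% share implies about 2180 sentences in total, which is an inference from the numbers above rather than a stated figure:

```python
import random

def split_80_10_10(examples, seed=42):
    """Shuffle and split a sequence of examples into 80/10/10 train/val/test."""
    rng = random.Random(seed)
    shuffled = list(examples)
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(0.8 * n)
    n_val = int(0.1 * n)
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])

train, val, test = split_80_10_10(range(2180))
print(len(train), len(val), len(test))  # 1744 218 218
```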

## Training procedure

### Training results

| Training Loss | Epoch | Step | Validation Loss | Overall Precision | Overall Recall | Overall F1 | Overall Accuracy |
|:-------------:|:-----:|:----:|:---------------:|:-----------------:|:--------------:|:----------:|:----------------:|
| No log        | 1.0   | 218  | 0.0607          | 0.6527            | 0.7786         | 0.7101     | 0.9859           |
| No log        | 2.0   | 436  | 0.0479          | 0.7355            | 0.8143         | 0.7729     | 0.9896           |
| 0.116         | 3.0   | 654  | 0.0414          | 0.7712            | 0.8429         | 0.8055     | 0.9908           |
| 0.116         | 4.0   | 872  | 0.0421          | 0.7857            | 0.8643         | 0.8231     | 0.9917           |
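The step counts in the table are consistent with the corpus size: 218 steps per epoch over 1744 training sentences suggests a batch size of 8. That batch size is an inference, not a value shown in this excerpt. A quick arithmetic check:

```python
train_sentences = 1744   # training-set size stated above
steps_per_epoch = 218    # from the table: step 218 at epoch 1.0
batch_size = train_sentences // steps_per_epoch  # inferred, not stated in the card
print(batch_size)  # 8
```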

### Framework versions

- Transformers 4.40.0
- Pytorch 2.1.0+cu121
- Datasets 2.19.0
- Tokenizers 0.19.1