Token Classification
GLiNER
PyTorch
ONNX
English
Commit e6282f2 by MikeG27 · verified · 1 Parent(s): 8db708c

Update README.md

Files changed (1)
  1. README.md +21 -4
README.md CHANGED
@@ -10,8 +10,28 @@ base_model:
10
  - urchade/gliner_small-v2.1
11
  ---
12
 
13
- # GliNER PII Detection 🚀
 
 
14
 
15
 
16
  ## Installation
17
  To use this model, you must install the GLiNER Python library:
@@ -34,7 +54,6 @@ model = GLiNER.from_pretrained(
34
  text = """
35
  Hey, just a quick update — I talked to David yesterday.
36
  He sent over the files from his private email ([email protected]), and we should be careful with his SSN: 123-45-6789.
37
- Also, please don't push the GitHub repo until we remove the API key: ghp_abcdEfgh1234567890.
38
  He mentioned his new address is 123 Maple Street in New York.
39
  His PC address is 192.168.1.100.
40
  """
@@ -42,7 +61,6 @@ His PC address is 192.168.1.100.
42
  labels = ["name",
43
  "email",
44
  "ssn",
45
- "api_key",
46
  "street_address",
47
  "date",
48
  "ipv4"]
@@ -58,7 +76,6 @@ David => name => 0.9066112041473389
58
  yesterday => date => 0.9482080340385437
59
  [email protected] => email => 0.9911587834358215
60
  123-45-6789 => ssn => 0.8612598180770874
61
- ghp_abcdEfgh1234567890 => api_key => 0.6934646964073181
62
  123 Maple Street in New York => street_address => 0.9869663715362549
63
  192.168.1.100 => ipv4 => 0.9810121059417725
64
  ```
 
10
  - urchade/gliner_small-v2.1
11
  ---
12
 
13
+ # Gravitee GliNER PII Detection 🚀
14
+ GLiNER is a Named Entity Recognition (NER) model that can identify any entity type using a bidirectional transformer encoder (BERT-like). It is a practical alternative to traditional NER models, which are limited to predefined entity types, and to Large Language Models (LLMs), which are flexible but too large and costly for resource-constrained scenarios.
15
+ The entity types the model was evaluated on are listed under Evaluation results; because GLiNER takes the label names as input at inference time, it can also identify other entity types.
16
 
17
+ Evaluation results:
18
+
19
+ | Entity | Precision | Recall | F1 Score |
20
+ |---------------------------|-----------|-----------|-----------|
21
+ | email | 0.937213 | 0.925870 | 0.931507 |
22
+ | phone_number | 0.898515 | 0.876812 | 0.887531 |
23
+ | name | 0.929052 | 0.824776 | 0.873814 |
24
+ | date_of_birth | 0.813953 | 0.937500 | 0.871369 |
25
+ | date | 0.888942 | 0.839801 | 0.863673 |
26
+ | location | 0.881579 | 0.829833 | 0.854924 |
27
+ | company | 0.821222 | 0.873162 | 0.846396 |
28
+ | ipv4 | 0.791667 | 0.890625 | 0.838235 |
29
+ | ssn | 0.897959 | 0.785714 | 0.838095 |
30
+ | bank_routing_number | 0.898305 | 0.746479 | 0.815385 |
31
+ | driver_license_number | 0.918367 | 0.725806 | 0.810811 |
32
+ | passport_number | 0.918367 | 0.714286 | 0.803571 |
33
+ | credit_card_security_code | 0.830986 | 0.756410 | 0.791946 |
34
+ | time | 0.834297 | 0.674455 | 0.745909 |
35
 
36
  ## Installation
37
  To use this model, you must install the GLiNER Python library:
 
54
  text = """
55
  Hey, just a quick update — I talked to David yesterday.
56
  He sent over the files from his private email ([email protected]), and we should be careful with his SSN: 123-45-6789.
 
57
  He mentioned his new address is 123 Maple Street in New York.
58
  His PC address is 192.168.1.100.
59
  """
 
61
  labels = ["name",
62
  "email",
63
  "ssn",
 
64
  "street_address",
65
  "date",
66
  "ipv4"]
 
76
  yesterday => date => 0.9482080340385437
77
  [email protected] => email => 0.9911587834358215
78
  123-45-6789 => ssn => 0.8612598180770874
 
79
  123 Maple Street in New York => street_address => 0.9869663715362549
80
  192.168.1.100 => ipv4 => 0.9810121059417725
81
  ```
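
The output lines above follow a `text => label => score` pattern. A minimal sketch of producing them, assuming the model call (not shown in this diff) returns a list of dictionaries with `text`, `label`, and `score` keys, as the GLiNER library's `predict_entities` method does. The `format_entities` helper and its `min_score` parameter are hypothetical, and the sample entities are copied from the output above rather than produced by running the model:

```python
from typing import Dict, List, Union

Entity = Dict[str, Union[str, float]]


def format_entities(entities: List[Entity], min_score: float = 0.0) -> List[str]:
    """Render entities as 'text => label => score', dropping low-confidence ones."""
    return [
        f"{e['text']} => {e['label']} => {e['score']}"
        for e in entities
        if e["score"] >= min_score
    ]


# Sample entities copied from the README output; not a live model run.
sample = [
    {"text": "David", "label": "name", "score": 0.9066112041473389},
    {"text": "123-45-6789", "label": "ssn", "score": 0.8612598180770874},
    {"text": "192.168.1.100", "label": "ipv4", "score": 0.9810121059417725},
]

for line in format_entities(sample, min_score=0.9):
    print(line)
```

With `min_score=0.9`, the `ssn` entity (score about 0.86) is filtered out, while `David` and the IP address are kept.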