Update README.md
README.md
CHANGED
@@ -10,8 +10,28 @@ base_model:
 - urchade/gliner_small-v2.1
 ---

-# GliNER PII Detection 🚀

 ## Installation
 To use this model, you must install the GLiNER Python library:
@@ -34,7 +54,6 @@ model = GLiNER.from_pretrained(
 text = """
 Hey, just a quick update — I talked to David yesterday.
 He sent over the files from his private email ([email protected]), and we should be careful with his SSN: 123-45-6789.
-Also, please don't push the GitHub repo until we remove the API key: ghp_abcdEfgh1234567890.
 He mentioned his new address is 123 Maple Street in New York.
 His PC address is 192.168.1.100.
 """
@@ -42,7 +61,6 @@ His PC address is 192.168.1.100.
 labels = ["name",
           "email",
           "ssn",
-          "api_key",
           "street_address",
           "date",
           "ipv4"]
@@ -58,7 +76,6 @@ David => name => 0.9066112041473389
 yesterday => date => 0.9482080340385437
 [email protected] => email => 0.9911587834358215
 123-45-6789 => ssn => 0.8612598180770874
-ghp_abcdEfgh1234567890 => api_key => 0.6934646964073181
 123 Maple Street in New York => street_address => 0.9869663715362549
 192.168.1.100 => ipv4 => 0.9810121059417725
 ```

 - urchade/gliner_small-v2.1
 ---

+# Gravitee GliNER PII Detection 🚀
+GLiNER is a Named Entity Recognition (NER) model that can identify any entity type using a bidirectional transformer encoder (BERT-like). It is a practical alternative to traditional NER models, which are limited to predefined entity types, and to Large Language Models (LLMs), which are flexible but too large and costly for resource-constrained scenarios.
+The entity types the model was evaluated on are listed under Evaluation results; because of the model's nature, it can also identify other entity types.

+Evaluation results:
+
+| Entity                    | Precision | Recall    | F1 Score  |
+|---------------------------|-----------|-----------|-----------|
+| email                     | 0.937213  | 0.925870  | 0.931507  |
+| phone_number              | 0.898515  | 0.876812  | 0.887531  |
+| name                      | 0.929052  | 0.824776  | 0.873814  |
+| date_of_birth             | 0.813953  | 0.937500  | 0.871369  |
+| date                      | 0.888942  | 0.839801  | 0.863673  |
+| location                  | 0.881579  | 0.829833  | 0.854924  |
+| company                   | 0.821222  | 0.873162  | 0.846396  |
+| ipv4                      | 0.791667  | 0.890625  | 0.838235  |
+| ssn                       | 0.897959  | 0.785714  | 0.838095  |
+| bank_routing_number       | 0.898305  | 0.746479  | 0.815385  |
+| driver_license_number     | 0.918367  | 0.725806  | 0.810811  |
+| passport_number           | 0.918367  | 0.714286  | 0.803571  |
+| credit_card_security_code | 0.830986  | 0.756410  | 0.791946  |
+| time                      | 0.834297  | 0.674455  | 0.745909  |
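
The F1 Score column in the evaluation table is the harmonic mean of precision and recall, F1 = 2PR / (P + R). The following is a minimal Python sketch that recomputes it for three rows copied from the table as a sanity check. The names `f1_score` and `reported` are illustrative only and are not part of the GLiNER library.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (precision, recall, reported F1) copied from the evaluation table.
reported = {
    "email": (0.937213, 0.925870, 0.931507),
    "ssn": (0.897959, 0.785714, 0.838095),
    "time": (0.834297, 0.674455, 0.745909),
}

for entity, (p, r, f1) in reported.items():
    computed = f1_score(p, r)
    print(f"{entity}: computed F1 = {computed:.6f}, reported F1 = {f1:.6f}")
```

Small differences in the last digit are expected, since the table values are already rounded to six decimals.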

 ## Installation
 To use this model, you must install the GLiNER Python library:

 text = """
 Hey, just a quick update — I talked to David yesterday.
 He sent over the files from his private email ([email protected]), and we should be careful with his SSN: 123-45-6789.
 He mentioned his new address is 123 Maple Street in New York.
 His PC address is 192.168.1.100.
 """

 labels = ["name",
           "email",
           "ssn",
           "street_address",
           "date",
           "ipv4"]

 yesterday => date => 0.9482080340385437
 [email protected] => email => 0.9911587834358215
 123-45-6789 => ssn => 0.8612598180770874
 123 Maple Street in New York => street_address => 0.9869663715362549
 192.168.1.100 => ipv4 => 0.9810121059417725
 ```
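
Detections in the `text => label => score` form can be used to redact the input. The sketch below is a minimal illustration, assuming each entity is a dict with `start`, `end`, and `label` keys (character offsets into the input), which is the shape GLiNER's `predict_entities` returns. `find_span` and `redact` are hypothetical helpers, and the spans are built by hand here so the example runs without downloading the model.

```python
def find_span(text, value, label):
    """Build an entity dict for the first occurrence of `value` in `text`."""
    start = text.index(value)
    return {"start": start, "end": start + len(value), "text": value, "label": label}


def redact(text, entities):
    """Replace each detected span with a [LABEL] placeholder."""
    pieces = []
    cursor = 0
    for ent in sorted(entities, key=lambda e: e["start"]):
        if ent["start"] < cursor:
            continue  # skip spans that overlap one already redacted
        pieces.append(text[cursor:ent["start"]])
        pieces.append(f"[{ent['label'].upper()}]")
        cursor = ent["end"]
    pieces.append(text[cursor:])
    return "".join(pieces)


# Hand-built stand-ins for model output (assumption: same keys as GLiNER entities).
sample = "Be careful with his SSN: 123-45-6789. His PC address is 192.168.1.100."
entities = [
    find_span(sample, "192.168.1.100", "ipv4"),
    find_span(sample, "123-45-6789", "ssn"),
]
print(redact(sample, entities))
```

Sorting by `start` and tracking a cursor keeps the offsets valid while the string is rebuilt, which in-place replacement from left to right would not.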