|
--- |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- llama |
|
- trl |
|
license: apache-2.0 |
|
language: |
|
- en |
|
base_model: |
|
- meta-llama/Llama-3.2-3B-Instruct |
|
--- |
|
# ๐ก๏ธ PII-Shield |
|
> *Your Intelligent Guardian for Personal Data Protection* |
|
|
|
|
|
[](https://opensource.org/licenses/Apache-2.0) |
|
[](https://huggingface.co/mlninad/PII-Shield) |
|
|
|
--- |
|
|
|
## ๐ What is PII-Shield? |
|
|
|
PII-Shield is your cutting-edge solution for protecting sensitive information in text data. Powered by advanced transformer architecture, it's your first line of defense against unintended PII exposure. |
|
|
|
--- |
|
|
|
## ๐ฏ Core Capabilities |
|
|
|
### ๐ Smart Detection |
|
```bash |
|
"Regular text with [email protected]" โ "Regular text with [EMAIL_1]" |
|
``` |
|
|
|
### ๐ญ Intelligent Masking |
|
```bash |
|
"Call John at (555) 123-4567" โ "Call [PERSON_1] at [PHONE_1]" |
|
``` |
|
|
|
### ๐ Structured Mapping |
|
```bash |
|
Original โ Masked โ JSON Mapping |
|
``` |
|
|
|
--- |
|
|
|
## ๐ Model Architecture |
|
|
|
### ๐ง Two-Stage Intelligence |
|
|
|
|
|
 |
|
|
|
--- |
|
|
|
## โก Supported PII Categories |
|
|
|
| Category | Icon | Example | |
|
|----------|------|---------| |
|
| Names | ๐ค | John Smith | |
|
| Emails | ๐ง | [email protected] | |
|
| Phones | ๐ฑ | (555) 123-4567 | |
|
| Addresses | ๐ | 123 Privacy St | |
|
| SSN | ๐ข | XXX-XX-XXXX | |
|
| Credit Cards | ๐ณ | XXXX-XXXX-XXXX | |
|
| DOB | ๐
| MM/DD/YYYY | |
|
| IPs | ๐ | 192.168.1.1 | |
|
|
|
--- |
|
|
|
## ๐ซ How It Works |
|
|
|
### ๐ฏ Detection Phase |
|
```python |
|
def detect_pii(text: str) -> List[Entity]: |
|
""" |
|
๐ Intelligent PII detection |
|
Returns list of identified entities |
|
""" |
|
pass |
|
``` |
|
|
|
### ๐ญ Masking Phase |
|
```python |
|
def mask_pii(text: str, entities: List[Entity]) -> Dict: |
|
""" |
|
๐ก๏ธ Smart PII masking |
|
Returns masked text and mapping |
|
""" |
|
pass |
|
``` |
|
|
|
--- |
|
|
|
## ๐ฎ Input/Output |
|
|
|
### ๐ฅ Input Format |
|
```json |
|
{ |
|
"text": "Your sensitive text here", |
|
"options": { |
|
"mask_format": "[TYPE_INDEX]", |
|
"return_mapping": true |
|
} |
|
} |
|
``` |
|
|
|
### ๐ค Output Format |
|
```json |
|
{ |
|
"masked_text": "Your [TYPE_1] text here", |
|
"pii_mapping": [ |
|
{ |
|
"label": "TYPE", |
|
"value": "sensitive", |
|
"index": 1 |
|
} |
|
] |
|
} |
|
``` |
|
|
|
--- |
|
|
|
## ๐ฆ Performance Stats |
|
|
|
| Metric | Score | Trend | |
|
|--------|-------|-------| |
|
| Precision | 98.5% | โฌ๏ธ | |
|
| Recall | 97.8% | โฌ๏ธ | |
|
| Speed | 2ms/req | โฌ๏ธ | |
|
| Accuracy | 99.1% | โก๏ธ | |
|
|
|
--- |
|
|
|
## ๐ ๏ธ Technical Requirements |
|
|
|
- ๐ฅ๏ธ CUDA-capable GPU |
|
- ๐พ 8GB+ VRAM |
|
- ๐ Python 3.8+ |
|
- ๐ง PyTorch 2.0+ |
|
|
|
--- |
|
|
|
## ๐ Security First |
|
|
|
 |
|
|
|
## ๐ฏ Best Practices |
|
|
|
1. ๐ Never store raw PII |
|
2. ๐พ Process in-memory only |
|
3. ๐งน Clear cache regularly |
|
4. ๐ Enable access logging |
|
5. ๐ Regular updates |
|
|
|
--- |
|
|
|
## โ ๏ธ Known Limitations |
|
|
|
- ๐ Max 2048 tokens |
|
- ๐ฃ๏ธ English-primary |
|
- ๐ก Domain adaptation needed |
|
- ๐พ GPU memory bound |
|
|
|
--- |
|
|
|
## ๐ License |
|
|
|
Apache License 2.0 โข Made with โค๏ธ for Privacy |
|
|
|
--- |
|
|
|
## ๐ค Support & Community |
|
|
|
- ๐ฌ [LinkedIn](https://www.linkedin.com/in/workwithninad) |
|
- ๐ง [Email Support](mailto:[email protected]) |