---
tags:
- text-generation-inference
- transformers
- llama
- trl
license: apache-2.0
language:
- en
base_model:
- meta-llama/Llama-3.2-3B-Instruct
---
# 🛡️ PII-Shield
> *Your Intelligent Guardian for Personal Data Protection*
[![License](https://img.shields.io/badge/License-Apache%202.0-blue.svg)](https://opensource.org/licenses/Apache-2.0)
[![Made with ❤️](https://img.shields.io/badge/Made%20with%20%E2%9D%A4%EF%B8%8F%20by-Data%20Privacy%20Team-red.svg)](https://huggingface.co/mlninad/PII-Shield)
---
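## 🌟 What is PII-Shield?
PII-Shield detects and masks personally identifiable information (PII) in text. Built on `meta-llama/Llama-3.2-3B-Instruct`, it replaces names, emails, phone numbers, and other identifiers with indexed placeholders and can return a mapping of what was masked, so it can act as a first line of defense against unintended PII exposure.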
---
## 🎯 Core Capabilities
### 🔍 Smart Detection
```text
"Regular text with [email protected]" → "Regular text with [EMAIL_1]"
```
### 🎭 Intelligent Masking
```text
"Call John at (555) 123-4567" → "Call [PERSON_1] at [PHONE_1]"
```
### 📊 Structured Mapping
```text
Original → Masked → JSON Mapping
```
---
## 🚀 Model Architecture
### 🧠 Two-Stage Intelligence
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6434f4fe7b8247480110bf44/vSSoxcNxuIoAN-mI3aJvt.png)
---
## ⚡ Supported PII Categories
| Category | Icon | Example |
|----------|------|---------|
| Names | 👤 | John Smith |
| Emails | 📧 | [email protected] |
| Phones | 📱 | (555) 123-4567 |
| Addresses | 🏠 | 123 Privacy St |
| SSN | 🔢 | XXX-XX-XXXX |
| Credit Cards | 💳 | XXXX-XXXX-XXXX |
| DOB | 📅 | MM/DD/YYYY |
| IPs | 🌐 | 192.168.1.1 |
---
## 💫 How It Works
### 🎯 Detection Phase
```python
from typing import List


# "Entity" is a forward reference; its definition is not shown in this README.
def detect_pii(text: str) -> List["Entity"]:
    """
    🔍 Intelligent PII detection
    Returns list of identified entities
    """
    pass
```
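### 🎭 Masking Phase
```python
from typing import Dict, List


# "Entity" is a forward reference; its definition is not shown in this README.
def mask_pii(text: str, entities: List["Entity"]) -> Dict:
    """
    🛡️ Smart PII masking
    Returns masked text and mapping
    """
    pass
```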
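The masking phase can be sketched as follows. This is a minimal illustration, not the model's actual implementation: the `Entity` fields (`label`, `value`, `start`, `end`) are assumptions, and the returned dictionary simply mirrors the `masked_text` / `pii_mapping` shape from the Output Format section.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Entity:
    """Hypothetical entity record; the field names are assumptions."""
    label: str  # e.g. "EMAIL", "PERSON"
    value: str  # the matched text
    start: int  # character offset where the match begins
    end: int    # character offset just past the match


def mask_pii(text: str, entities: List[Entity]) -> Dict:
    """Replace each entity span with a [LABEL_N] placeholder."""
    counters: Dict[str, int] = {}
    pieces: List[str] = []
    mapping: List[Dict] = []
    cursor = 0
    # Walk spans left to right so per-label indices follow reading order.
    for ent in sorted(entities, key=lambda e: e.start):
        if ent.start < cursor:
            continue  # skip a span that overlaps one already masked
        counters[ent.label] = counters.get(ent.label, 0) + 1
        index = counters[ent.label]
        pieces.append(text[cursor:ent.start])
        pieces.append(f"[{ent.label}_{index}]")
        mapping.append({"label": ent.label, "value": ent.value, "index": index})
        cursor = ent.end
    pieces.append(text[cursor:])
    return {"masked_text": "".join(pieces), "pii_mapping": mapping}


text = "Call John at (555) 123-4567"
entities = [
    Entity("PERSON", "John", 5, 9),
    Entity("PHONE", "(555) 123-4567", 13, 27),
]
result = mask_pii(text, entities)
print(result["masked_text"])
```

In this sketch, indices are counted per label, so a repeated value receives a new index each time it appears. Whether PII-Shield reuses an index for repeated values is not stated here.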
---
## 🎮 Input/Output
### 📥 Input Format
```json
{
  "text": "Your sensitive text here",
  "options": {
    "mask_format": "[TYPE_INDEX]",
    "return_mapping": true
  }
}
```
### 📤 Output Format
```json
{
  "masked_text": "Your [TYPE_1] text here",
  "pii_mapping": [
    {
      "label": "TYPE",
      "value": "sensitive",
      "index": 1
    }
  ]
}
```
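Because the mapping records each placeholder's label, index, and original value, the original text can be reconstructed from the output. The sketch below shows one way to do that; the `unmask` helper is hypothetical (not part of PII-Shield) and assumes placeholders always follow the `[TYPE_INDEX]` format with uppercase labels.

```python
import json
import re
from typing import Dict, List


def unmask(masked_text: str, pii_mapping: List[Dict]) -> str:
    """Substitute each [LABEL_INDEX] placeholder with its mapped value."""
    lookup = {(m["label"], m["index"]): m["value"] for m in pii_mapping}

    def restore(match):
        key = (match.group(1), int(match.group(2)))
        # Leave unknown placeholders untouched rather than guessing.
        return lookup.get(key, match.group(0))

    # Labels may contain underscores (e.g. CREDIT_CARD); the index is the
    # trailing run of digits after the last underscore.
    return re.sub(r"\[([A-Z_]+)_(\d+)\]", restore, masked_text)


response = json.loads("""
{
  "masked_text": "Your [TYPE_1] text here",
  "pii_mapping": [{"label": "TYPE", "value": "sensitive", "index": 1}]
}
""")
restored = unmask(response["masked_text"], response["pii_mapping"])
print(restored)
```

Anyone holding the mapping can reverse the masking, so the mapping should be handled as carefully as the raw PII itself.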
---
## 🚦 Performance Stats
| Metric | Score | Trend |
|--------|-------|-------|
| Precision | 98.5% | ⬆️ |
| Recall | 97.8% | ⬆️ |
| Speed | 2 ms/req | ⬇️ |
| Accuracy | 99.1% | ➡️ |
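This README does not state how the figures above were measured. For readers who want to run a comparable evaluation on their own data, the sketch below computes entity-level precision and recall under one common convention: a prediction counts as correct only if its label and character span match a gold annotation exactly. The `Span` representation and the exact-match rule are assumptions, not the author's documented method.

```python
from typing import Set, Tuple

# (label, start, end); a hypothetical representation of one entity.
Span = Tuple[str, int, int]


def precision_recall(predicted: Set[Span], gold: Set[Span]) -> Tuple[float, float]:
    """Exact-match entity-level precision and recall."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    return precision, recall


gold = {("PERSON", 5, 9), ("PHONE", 13, 27)}
predicted = {("PERSON", 5, 9), ("EMAIL", 30, 45)}
precision, recall = precision_recall(predicted, gold)
print(precision, recall)
```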
---
## 🛠️ Technical Requirements
- 🖥️ CUDA-capable GPU
- 💾 8GB+ VRAM
- 🐍 Python 3.8+
- 🔧 PyTorch 2.0+
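The requirements above can be checked before loading the model. The helper below is a hypothetical sketch (`check_environment` is not part of PII-Shield); it uses only `sys.version_info` and standard PyTorch calls, and reports `False` for the PyTorch-related checks if PyTorch is not installed.

```python
import sys
from typing import Dict


def check_environment(min_vram_gb: float = 8.0) -> Dict[str, bool]:
    """Report which of the listed requirements the current machine meets."""
    checks = {
        "python": sys.version_info >= (3, 8),
        "pytorch": False,
        "cuda": False,
        "vram": False,
    }
    try:
        import torch
    except ImportError:
        return checks  # PyTorch missing: remaining checks stay False

    # Versions look like "2.1.0+cu121"; only major and minor are compared.
    major, minor = (int(part) for part in torch.__version__.split(".")[:2])
    checks["pytorch"] = (major, minor) >= (2, 0)
    checks["cuda"] = torch.cuda.is_available()
    if checks["cuda"]:
        total_bytes = torch.cuda.get_device_properties(0).total_memory
        # Compared in decimal gigabytes, since nominal "8GB" cards
        # report slightly less than 8 GiB.
        checks["vram"] = total_bytes / 1e9 >= min_vram_gb
    return checks


report = check_environment()
for name, ok in report.items():
    print(f"{name}: {'ok' if ok else 'missing'}")
```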
---
## 🔒 Security First
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6434f4fe7b8247480110bf44/z0zmLQATLKRNHFoA4nOZK.png)
## 🎯 Best Practices
1. 🔐 Never store raw PII
2. 💾 Process in-memory only
3. 🧹 Clear cache regularly
4. 📝 Enable access logging
5. 🔄 Apply updates regularly
---
## ⚠️ Known Limitations
- 📏 Max 2048 tokens
- 🗣️ English-primary
- 💡 Domain adaptation needed
- 💾 GPU memory bound
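One way to work within the 2048-token limit is to split long inputs into chunks and mask each chunk separately. The sketch below is an approximation under stated assumptions: it estimates tokens from whitespace-separated words using a hypothetical `tokens_per_word` ratio instead of calling the model's tokenizer, and `chunk_text` is an illustrative helper, not part of PII-Shield.

```python
from typing import List


def chunk_text(text: str, max_tokens: int = 2048, tokens_per_word: float = 1.5) -> List[str]:
    """Split text on whitespace into chunks under a rough token budget."""
    # tokens_per_word is a guessed ratio; measure it with the real
    # tokenizer before relying on it.
    words_per_chunk = max(1, int(max_tokens / tokens_per_word))
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]


chunks = chunk_text("a b c d e", max_tokens=3, tokens_per_word=1.5)
print(chunks)
```

Splitting this way collapses line breaks and repeated spaces, and an entity that straddles a chunk boundary may be missed. Placeholder indices would also restart in each chunk, so mappings from separate chunks need renumbering before they are merged.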
---
## 📜 License
Apache License 2.0 • Made with ❤️ for Privacy
---
## 🤝 Support & Community
- 💬 [LinkedIn](https://www.linkedin.com/in/workwithninad)
- 📧 [Email Support](mailto:[email protected])