PII-Shield / README.md
mlninad's picture
Update README.md
1aa529e verified
metadata
tags:
  - text-generation-inference
  - transformers
  - llama
  - trl
license: apache-2.0
language:
  - en
base_model:
  - meta-llama/Llama-3.2-3B-Instruct

๐Ÿ›ก๏ธ PII-Shield

Your Intelligent Guardian for Personal Data Protection

License Made with โค๏ธ


๐ŸŒŸ What is PII-Shield?

PII-Shield is your cutting-edge solution for protecting sensitive information in text data. Powered by advanced transformer architecture, it's your first line of defense against unintended PII exposure.


๐ŸŽฏ Core Capabilities

๐Ÿ” Smart Detection

"Regular text with [email protected]" โ†’ "Regular text with [EMAIL_1]"

๐ŸŽญ Intelligent Masking

"Call John at (555) 123-4567" โ†’ "Call [PERSON_1] at [PHONE_1]"

๐Ÿ“Š Structured Mapping

Original โ†’ Masked โ†’ JSON Mapping

๐Ÿš€ Model Architecture

๐Ÿง  Two-Stage Intelligence

image/png


โšก Supported PII Categories

Category Icon Example
Names ๐Ÿ‘ค John Smith
Emails ๐Ÿ“ง [email protected]
Phones ๐Ÿ“ฑ (555) 123-4567
Addresses ๐Ÿ  123 Privacy St
SSN ๐Ÿ”ข XXX-XX-XXXX
Credit Cards ๐Ÿ’ณ XXXX-XXXX-XXXX
DOB ๐Ÿ“… MM/DD/YYYY
IPs ๐ŸŒ 192.168.1.1

๐Ÿ’ซ How It Works

๐ŸŽฏ Detection Phase

def detect_pii(text: str) -> List[Entity]:
    """
    ๐Ÿ” Intelligent PII detection
    Returns list of identified entities
    """
    pass

๐ŸŽญ Masking Phase

def mask_pii(text: str, entities: List[Entity]) -> Dict:
    """
    ๐Ÿ›ก๏ธ Smart PII masking
    Returns masked text and mapping
    """
    pass

๐ŸŽฎ Input/Output

๐Ÿ“ฅ Input Format

{
    "text": "Your sensitive text here",
    "options": {
        "mask_format": "[TYPE_INDEX]",
        "return_mapping": true
    }
}

๐Ÿ“ค Output Format

{
    "masked_text": "Your [TYPE_1] text here",
    "pii_mapping": [
        {
            "label": "TYPE",
            "value": "sensitive",
            "index": 1
        }
    ]
}

๐Ÿšฆ Performance Stats

Metric Score Trend
Precision 98.5% โฌ†๏ธ
Recall 97.8% โฌ†๏ธ
Speed 2ms/req โฌ‡๏ธ
Accuracy 99.1% โžก๏ธ

๐Ÿ› ๏ธ Technical Requirements

  • ๐Ÿ–ฅ๏ธ CUDA-capable GPU
  • ๐Ÿ’พ 8GB+ VRAM
  • ๐Ÿ Python 3.8+
  • ๐Ÿ”ง PyTorch 2.0+

๐Ÿ”’ Security First

image/png

๐ŸŽฏ Best Practices

  1. ๐Ÿ” Never store raw PII
  2. ๐Ÿ’พ Process in-memory only
  3. ๐Ÿงน Clear cache regularly
  4. ๐Ÿ“ Enable access logging
  5. ๐Ÿ”„ Regular updates

โš ๏ธ Known Limitations

  • ๐Ÿ“ Max 2048 tokens
  • ๐Ÿ—ฃ๏ธ English-primary
  • ๐Ÿ’ก Domain adaptation needed
  • ๐Ÿ’พ GPU memory bound

๐Ÿ“œ License

Apache License 2.0 โ€ข Made with โค๏ธ for Privacy


๐Ÿค Support & Community