Instructions to use King-Harry/NinjaMasker-PII-Redaction with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use King-Harry/NinjaMasker-PII-Redaction with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="King-Harry/NinjaMasker-PII-Redaction")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("King-Harry/NinjaMasker-PII-Redaction")
model = AutoModelForCausalLM.from_pretrained("King-Harry/NinjaMasker-PII-Redaction")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use King-Harry/NinjaMasker-PII-Redaction with vLLM:
Install from pip and serve the model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "King-Harry/NinjaMasker-PII-Redaction"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "King-Harry/NinjaMasker-PII-Redaction",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```bash
docker model run hf.co/King-Harry/NinjaMasker-PII-Redaction
```
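Because the server exposes an OpenAI-compatible API, it can also be called from Python. A minimal sketch using the official `openai` client package (the `api_key` value is a placeholder; the local server does not verify it by default):

```python
# pip install openai
from openai import OpenAI

# Point the client at the local vLLM server instead of api.openai.com.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="King-Harry/NinjaMasker-PII-Redaction",
    prompt="Once upon a time,",
    max_tokens=512,
    temperature=0.5,
)
print(completion.choices[0].text)
```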
- SGLang
How to use King-Harry/NinjaMasker-PII-Redaction with SGLang:
Install from pip and serve the model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "King-Harry/NinjaMasker-PII-Redaction" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "King-Harry/NinjaMasker-PII-Redaction",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "King-Harry/NinjaMasker-PII-Redaction" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "King-Harry/NinjaMasker-PII-Redaction",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use King-Harry/NinjaMasker-PII-Redaction with Docker Model Runner:
```bash
docker model run hf.co/King-Harry/NinjaMasker-PII-Redaction
```
🤗 About me • 🌱 Harry.vc • 🐦 X.com • 📚 Papers
🥷 Model Card for King-Harry/NinjaMasker-PII-Redaction
This model is designed for the redaction and masking of Personally Identifiable Information (PII) in complex text scenarios like call transcripts.
News
- 🔥🔥🔥 [2023/10/06] Building a new dataset: creating a significantly improved dataset and fixing stop tokens.
- 🔥🔥🔥 [2023/10/05] NinjaMasker-PII-Redaction version 1 was released.
Model Details
📝 Model Description
This model aims to handle complex and difficult instances of PII redaction that traditional classification models struggle with.
- Developed by: Harry Roy McLaughlin
- Model type: Fine-tuned Language Model
- Language(s) (NLP): English
- License: TBD
- Fine-tuned from model: NousResearch/Llama-2-7b-chat-hf
🌱 Model Sources
- Repository: Hosted on Hugging Face
- Demo: Coming soon
🧪 Test the model
Log into Hugging Face (if you haven't already):
```python
!pip install transformers
from huggingface_hub import notebook_login

notebook_login()
```
Load Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, logging

# Ignore warnings
logging.set_verbosity(logging.CRITICAL)

# Load the model and tokenizer (the notebook login above supplies the auth token)
model_name = "King-Harry/NinjaMasker-PII-Redaction"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
```
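On smaller GPUs such as the free Colab T4, loading this 7B model in full precision can exhaust memory. A hedged variant that loads the weights in half precision and places them automatically (assumes `torch` and the `accelerate` package are installed):

```python
import torch
from transformers import AutoModelForCausalLM

# Half-precision weights roughly halve GPU memory use for a 7B model.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)
```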
Generate Text
```python
# Generate text
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=100)
prompt = "My name is Harry and I live in Winnipeg. My phone number is ummm 204 no 203, ahh 4344, no 4355"
result = pipe(f"<s>[INST] {prompt} [/INST]")

# Print the generated text
print(result[0]['generated_text'])
```
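The pipeline returns the full sequence, including the `[INST] ... [/INST]` prompt wrapper. A minimal sketch for keeping only the model's redacted continuation (the split logic is an assumption about the output shape, not something the card specifies):

```python
# Keep only the text generated after the closing instruction tag.
generated = result[0]["generated_text"]
redacted = generated.split("[/INST]", 1)[-1].strip()
print(redacted)
```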
Uses
🎯 Direct Use
The model is specifically designed for direct redaction and masking of PII in complex text inputs such as call transcripts.
⬇️ Downstream Use
The model has potential for numerous downstream applications, though specific use cases are yet to be fully explored.
❌ Out-of-Scope Use
The model is under development; use in critical systems requiring 100% accuracy is not recommended at this stage.
⚠️ Bias, Risks, and Limitations
The model is trained only on English text, which may limit its applicability in multilingual or non-English settings.
📌 Recommendations
Users should be aware of the model's language-specific training and should exercise caution when using it in critical systems; one practical safeguard is to audit the model's output for residual PII, as sketched below.
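A hedged example of such an audit: after the model masks a transcript, scan the output with simple regular expressions for obvious leftovers such as phone numbers or email addresses (the patterns below are illustrative, not exhaustive):

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
RESIDUAL_PII = {
    "phone": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def audit(redacted_text: str) -> list[str]:
    """Return the names of any PII patterns still present after redaction."""
    return [name for name, rx in RESIDUAL_PII.items() if rx.search(redacted_text)]

assert audit("Call me at [PHONE_NUMBER].") == []
assert audit("Call me at 204-555-0142.") == ["phone"]
```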
🏋️ Training Details
📊 Training Data
The model was trained on a dataset of 43,000 question/answer pairs containing various forms of PII, and it looks for 63 distinct PII labels; an illustrative pair is sketched below.
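To make the task concrete, here is a hypothetical input/output pair in the spirit of that data. The placeholder tag names are assumptions made for illustration; the card does not list the model's actual 63 labels.

```python
# Hypothetical example; the tag names are illustrative, not the model's real labels.
raw = "Hi, this is Sarah, you can reach me at 204-555-0142 or sarah@example.com."
masked = "Hi, this is [FIRST_NAME], you can reach me at [PHONE_NUMBER] or [EMAIL_ADDRESS]."
```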
⚙️ Training Hyperparameters
- Training regime: FP16
🚀 Speeds, Sizes, Times
- Hardware: T4 GPU
- Cloud Provider: Google Colab Pro (for the extra RAM)
- Training Duration: ~4 hours
📈 Evaluation
Evaluation is pending.
🌍 Environmental Impact
Given the significant computing resources used, the model likely has a substantial carbon footprint. Exact calculations are pending.
- Hardware Type: T4 GPU
- Hours used: ~4
- Cloud Provider: Google Colab Pro
📐 Technical Specifications
🏗️ Model Architecture and Objective
The model is a fine-tuned version of Llama 2 7B, tailored for PII redaction tasks; a quick configuration check is sketched below.
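As a quick sanity check of the underlying architecture, the model's configuration can be fetched from the Hub without downloading the weights (standard `transformers` API; the expected values are those of the Llama 2 7B base):

```python
from transformers import AutoConfig

# Fetch only the configuration; no weights are downloaded.
config = AutoConfig.from_pretrained("King-Harry/NinjaMasker-PII-Redaction")
print(config.model_type)         # expected: "llama"
print(config.num_hidden_layers)  # 32 for the 7B variant
print(config.hidden_size)        # 4096 for the 7B variant
```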
🖥️ Hardware
- Training Hardware: T4 GPU (with extra RAM)
💾 Software
- Environment: Google Colab Pro
💪 Disclaimer
This model is in its first generation and will be updated rapidly.
✍️ Model Card Authors
Harry Roy McLaughlin
📞 Model Card Contact