Instructions to use King-Harry/NinjaMasker-PII-Redaction with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use King-Harry/NinjaMasker-PII-Redaction with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="King-Harry/NinjaMasker-PII-Redaction")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("King-Harry/NinjaMasker-PII-Redaction")
model = AutoModelForCausalLM.from_pretrained("King-Harry/NinjaMasker-PII-Redaction")
```
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use King-Harry/NinjaMasker-PII-Redaction with vLLM:
Install from pip and serve the model
```bash
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "King-Harry/NinjaMasker-PII-Redaction"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "King-Harry/NinjaMasker-PII-Redaction",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker
```bash
docker model run hf.co/King-Harry/NinjaMasker-PII-Redaction
```
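Because the server exposes an OpenAI-compatible API, it can also be called from Python. A minimal sketch using the official `openai` client package (the `api_key` value is a placeholder; the local server does not verify it by default):

```python
# pip install openai
from openai import OpenAI

# Point the client at the local vLLM server instead of api.openai.com.
client = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")

completion = client.completions.create(
    model="King-Harry/NinjaMasker-PII-Redaction",
    prompt="Once upon a time,",
    max_tokens=512,
    temperature=0.5,
)
print(completion.choices[0].text)
```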
- SGLang
How to use King-Harry/NinjaMasker-PII-Redaction with SGLang:
Install from pip and serve the model
```bash
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
  --model-path "King-Harry/NinjaMasker-PII-Redaction" \
  --host 0.0.0.0 \
  --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "King-Harry/NinjaMasker-PII-Redaction",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
Use Docker images
```bash
docker run --gpus all \
  --shm-size 32g \
  -p 30000:30000 \
  -v ~/.cache/huggingface:/root/.cache/huggingface \
  --env "HF_TOKEN=<secret>" \
  --ipc=host \
  lmsysorg/sglang:latest \
  python3 -m sglang.launch_server \
    --model-path "King-Harry/NinjaMasker-PII-Redaction" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
  -H "Content-Type: application/json" \
  --data '{
    "model": "King-Harry/NinjaMasker-PII-Redaction",
    "prompt": "Once upon a time,",
    "max_tokens": 512,
    "temperature": 0.5
  }'
```
- Docker Model Runner
How to use King-Harry/NinjaMasker-PII-Redaction with Docker Model Runner:
```bash
docker model run hf.co/King-Harry/NinjaMasker-PII-Redaction
```
🤗 About me • 🌱 Harry.vc • 🐦 X.com • 📚 Papers
🥷 Model Card for King-Harry/NinjaMasker-PII-Redaction
This model is designed for the redaction and masking of Personally Identifiable Information (PII) in complex text scenarios like call transcripts.
News
- 🔥🔥🔥 [2023/10/06] Building a new dataset: creating a significantly improved dataset and fixing stop tokens.
- 🔥🔥🔥 [2023/10/05] NinjaMasker-PII-Redaction version 1 was released.
Model Details
📝 Model Description
This model aims to handle complex and difficult instances of PII redaction that traditional classification models struggle with.
- Developed by: Harry Roy McLaughlin
- Model type: Fine-tuned Language Model
- Language(s) (NLP): English
- License: TBD
- Fine-tuned from model: NousResearch/Llama-2-7b-chat-hf
🌱 Model Sources
- Repository: Hosted on Hugging Face
- Demo: Coming soon
🧪 Test the model
Log into Hugging Face (if you haven't already):
```python
!pip install transformers
from huggingface_hub import notebook_login

notebook_login()
```
Load Model
```python
from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline, logging

# Ignore warnings
logging.set_verbosity(logging.CRITICAL)

# Load the model and tokenizer (the notebook login above supplies the auth token)
model_name = "King-Harry/NinjaMasker-PII-Redaction"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)
```
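On smaller GPUs such as the free Colab T4, loading this 7B model in full precision can exhaust memory. A hedged variant that loads the weights in half precision and places them automatically (assumes `torch` and the `accelerate` package are installed):

```python
import torch
from transformers import AutoModelForCausalLM

# Half-precision weights roughly halve GPU memory use for a 7B model.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,
    device_map="auto",  # requires the accelerate package
)
```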
Generate Text
```python
# Generate text
pipe = pipeline(task="text-generation", model=model, tokenizer=tokenizer, max_length=100)
prompt = "My name is Harry and I live in Winnipeg. My phone number is ummm 204 no 203, ahh 4344, no 4355"
result = pipe(f"<s>[INST] {prompt} [/INST]")

# Print the generated text
print(result[0]['generated_text'])
```
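The pipeline returns the full sequence, including the `[INST] ... [/INST]` prompt wrapper. A minimal sketch for keeping only the model's redacted continuation (the split logic is an assumption about the output shape, not something the card specifies):

```python
# Keep only the text generated after the closing instruction tag.
generated = result[0]["generated_text"]
redacted = generated.split("[/INST]", 1)[-1].strip()
print(redacted)
```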
Uses
🎯 Direct Use
The model is specifically designed for direct redaction and masking of PII in complex text inputs such as call transcripts.
⬇️ Downstream Use
The model has potential for numerous downstream applications, though specific use cases are yet to be fully explored.
❌ Out-of-Scope Use
The model is under development; use in critical systems requiring 100% accuracy is not recommended at this stage.
⚠️ Bias, Risks, and Limitations
The model is trained only on English text, which may limit its applicability in multilingual or non-English settings.
📌 Recommendations
Users should be aware of the model's language-specific training and should exercise caution when using it in critical systems; one practical safeguard is to audit the model's output for residual PII, as sketched below.
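A hedged example of such an audit: after the model masks a transcript, scan the output with simple regular expressions for obvious leftovers such as phone numbers or email addresses (the patterns below are illustrative, not exhaustive):

```python
import re

# Illustrative patterns only; real PII detection needs far broader coverage.
RESIDUAL_PII = {
    "phone": re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}

def audit(redacted_text: str) -> list[str]:
    """Return the names of any PII patterns still present after redaction."""
    return [name for name, rx in RESIDUAL_PII.items() if rx.search(redacted_text)]

assert audit("Call me at [PHONE_NUMBER].") == []
assert audit("Call me at 204-555-0142.") == ["phone"]
```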
🏋️ Training Details
📊 Training Data
The model was trained on a dataset of 43,000 question/answer pairs containing various forms of PII, and it looks for 63 distinct PII labels; an illustrative pair is sketched below.
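To make the task concrete, here is a hypothetical input/output pair in the spirit of that data. The placeholder tag names are assumptions made for illustration; the card does not list the model's actual 63 labels.

```python
# Hypothetical example; the tag names are illustrative, not the model's real labels.
raw = "Hi, this is Sarah, you can reach me at 204-555-0142 or sarah@example.com."
masked = "Hi, this is [FIRST_NAME], you can reach me at [PHONE_NUMBER] or [EMAIL_ADDRESS]."
```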
⚙️ Training Hyperparameters
- Training regime: FP16
🚀 Speeds, Sizes, Times
- Hardware: T4 GPU
- Cloud Provider: Google Colab Pro (for the extra RAM)
- Training Duration: ~4 hours
📈 Evaluation
Evaluation is pending.
🌍 Environmental Impact
Given the significant computing resources used, the model likely has a substantial carbon footprint. Exact calculations are pending.
- Hardware Type: T4 GPU
- Hours used: ~4
- Cloud Provider: Google Colab Pro
📐 Technical Specifications
🏗️ Model Architecture and Objective
The model is a fine-tuned version of Llama 2 7B, tailored for PII redaction tasks; a quick configuration check is sketched below.
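As a quick sanity check of the underlying architecture, the model's configuration can be fetched from the Hub without downloading the weights (standard `transformers` API; the expected values are those of the Llama 2 7B base):

```python
from transformers import AutoConfig

# Fetch only the configuration; no weights are downloaded.
config = AutoConfig.from_pretrained("King-Harry/NinjaMasker-PII-Redaction")
print(config.model_type)         # expected: "llama"
print(config.num_hidden_layers)  # 32 for the 7B variant
print(config.hidden_size)        # 4096 for the 7B variant
```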
🖥️ Hardware
- Training Hardware: T4 GPU (with extra RAM)
💾 Software
- Environment: Google Colab Pro
💪 Disclaimer
This model is in its first generation and will be updated rapidly.
✍️ Model Card Authors
Harry Roy McLaughlin
📞 Model Card Contact