Perplexity released a dataset (BrowseSafe) and a benchmark for catching and preventing malicious prompt-injection instructions in real time.
We trained a prompt injection classifier on BrowseSafe using adaptive-classifier with ModernBERT-base embeddings.
74.9% F1 on detecting prompt injection in web content.
Model -> adaptive-classifier/browsesafe
Dataset -> perplexity-ai/browsesafe-bench
Repo -> https://github.com/codelion/adaptive-classifier
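For readers unfamiliar with the metric, here is a minimal sketch of how an F1 score is computed for a binary detector. The label names ("malicious", "benign") and the toy predictions are assumptions made up for illustration; they are not taken from BrowseSafe, and the post does not say which F1 variant (binary, macro, or weighted) the 74.9% figure refers to.

```python
def binary_f1(y_true, y_pred, positive="malicious"):
    """Return (precision, recall, f1) for the given positive class.

    The "malicious" label name is a hypothetical placeholder.
    """
    pairs = list(zip(y_true, y_pred))
    # True positives: injections that were flagged.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: benign content that was flagged.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: injections that were missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1 is the harmonic mean of precision and recall.
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, invented for this example only.
y_true = ["malicious", "malicious", "malicious", "benign", "benign", "benign"]
y_pred = ["malicious", "malicious", "benign", "malicious", "benign", "benign"]

precision, recall, f1 = binary_f1(y_true, y_pred)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Because F1 is a harmonic mean, it stays low unless both precision and recall are reasonably high, which is why it is a common choice for detection tasks where missed injections and false alarms both matter.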