# DeBERTa-v3 Smishing & Spam Detector — v0.2.2
A high-precision, robust SMS/MMS spam detector fine-tuned for real-world deployment.
Trained on approximately 150,000 English messages with a 3:1 benign-to-spam class ratio.
Fixes critical failure modes from v0.1 through architectural and training innovations.
## What Is New in v0.2.2

v0.2.2 is the trained, production-ready release of the v0.2 architecture. It incorporates a systematic error analysis of 624 misclassified samples and targeted improvements that address:
- Overconfident false positives on legitimate promotional messages
- Missed spam due to obfuscation, truncation, or feature poverty
Compared with v0.1, this release achieves significant improvements in precision while maintaining high recall.
## Performance Summary

### Test Results on 38,331 Hold-Out Samples
- Best model checkpoint: epoch 8 (saved at `model_output_v2.2/checkpoint_epoch8.pt`)
- Optimal classification threshold (determined on the validation set): 0.720
### At the Optimized Threshold (Recommended for Deployment)
| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| Benign | 0.98 | 0.99 | 0.98 |
| Spam/Smishing | 0.92 | 0.90 | 0.91 |
| Accuracy (overall) | — | — | 0.97 |
#### Overall Metrics @ Optimized Threshold (0.72)
| Metric | Value |
|---|---|
| F1 | 0.9096 |
| Precision | 0.9170 |
| Recall | 0.9023 |
| AUC-ROC | 0.9883 |
#### Confusion Matrix

| | Predicted Benign | Predicted Spam |
|---|---|---|
| Actual Benign | 32,131 | 468 |
| Actual Spam | 560 | 5,172 |
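The overall metrics reported above can be re-derived directly from this confusion matrix; a quick sanity check in plain Python:

```python
# Re-derive the @0.72 metrics from the reported confusion matrix.
tn, fp = 32131, 468   # actual benign row
fn, tp = 560, 5172    # actual spam row

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)
accuracy = (tp + tn) / (tp + tn + fp + fn)

print(f"precision={precision:.4f} recall={recall:.4f} f1={f1:.4f} acc={accuracy:.4f}")
# precision=0.9170 recall=0.9023 f1=0.9096 acc=0.9732
```

The values match the metric tables (spam precision 0.9170, recall 0.9023, F1 0.9096, overall accuracy 0.97).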
### At the Default Threshold (0.500) — Reference Only
| Class | Precision | Recall | F1-Score |
|---|---|---|---|
| Benign | 0.99 | 0.96 | 0.97 |
| Spam/Smishing | 0.79 | 0.95 | 0.86 |
#### Overall Metrics @ Threshold = 0.50 (Default)
| Metric | Value |
|---|---|
| F1 | 0.8619 |
| Precision | 0.7901 |
| Recall | 0.9480 |
| AUC-ROC | 0.9883 |
## Training and Validation Metrics
| Epoch | Train Loss | Val F1 (Optimal) | Threshold | AUC-ROC |
|---|---|---|---|---|
| 1 | 0.2453 | 0.8781 | 0.735 | 0.9823 |
| 2 | 0.2204 | 0.8942 | 0.710 | 0.9857 |
| 3 | 0.2183 | 0.8920 | 0.675 | 0.9846 |
| 4 | 0.2180 | 0.8991 | 0.675 | 0.9863 |
| 5 | 0.2156 | 0.8976 | 0.725 | 0.9844 |
| 6 | 0.2159 | 0.9037 | 0.695 | 0.9870 |
| 7 | 0.2158 | 0.9016 | 0.675 | 0.9864 |
| 8 | 0.2147 | 0.9075 | 0.720 | 0.9883 |
- Total training time: 166 minutes
- Hardware: 4 x NVIDIA RTX 3090 GPUs
- Gradient checkpointing enabled (memory-efficient training)
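The per-epoch "Threshold" column is obtained by sweeping candidate cutoffs over validation-set probabilities and keeping the F1-maximizing one. A minimal sketch of that selection step (the grid spacing and function name are illustrative, not the training script's exact code):

```python
def best_threshold(probs, labels, grid=None):
    """Return the (threshold, F1) pair that maximizes spam-class F1.

    probs: predicted spam probabilities; labels: 1 = spam, 0 = benign.
    """
    if grid is None:
        grid = [i / 200 for i in range(10, 192)]  # 0.05 .. 0.955 in 0.005 steps
    best_t, best_f1 = 0.5, 0.0
    for t in grid:
        tp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 1)
        fp = sum(1 for p, y in zip(probs, labels) if p >= t and y == 0)
        fn = sum(1 for p, y in zip(probs, labels) if p < t and y == 1)
        if tp == 0:
            continue
        prec, rec = tp / (tp + fp), tp / (tp + fn)
        f1 = 2 * prec * rec / (prec + rec)
        if f1 > best_f1:
            best_t, best_f1 = t, f1
    return best_t, best_f1
```

Running this per epoch explains why the reported threshold drifts between 0.675 and 0.735 across the table.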
## Architecture Highlights

### Model Structure

```text
Input SMS/MMS Text
  → DeBERTa-v3-base Encoder (89,065,216 parameters)
      ├─ [CLS] token embedding (768 dimensions)
      └─ Attention-weighted pooling over all tokens (768 dimensions)
  → 23 Engineered Features
      → Linear(23 → 128) + LayerNorm + GELU + Dropout (1,554,930 parameters)
  ↓
Combined representation: [CLS] ∥ Attention-Pooled ∥ Feature Embedding = 1,664 dimensions
  → Bottleneck projection (1,664 → 256) + Residual Block
  → Final classification head (256 → 2)
```
### Key Improvements Over v0.1
| Component | v0.1 | v0.2.2 |
|---|---|---|
| Pooling strategy | [CLS] only | Dual pooling: [CLS] + learned attention pooling |
| Engineered features | 15 basic features | 23 features (8 new, targeting evasion patterns) |
| Feature projection | Linear(15 → 64), no normalization | Linear(23 → 128) + LayerNorm + GELU |
| Loss function | Cross-Entropy Loss | Focal Loss (γ=2) + label smoothing (ε=0.05) |
| Threshold selection | Fixed at 0.6993 | Per-epoch optimization on validation set; final: 0.720 |
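The focal loss with γ=2 plus label smoothing ε=0.05 listed above can be sketched as follows (a minimal PyTorch version for illustration; the training script's exact implementation may differ):

```python
import torch
import torch.nn.functional as F

def focal_loss_with_smoothing(logits, targets, gamma=2.0, eps=0.05, num_classes=2):
    """Focal loss (Lin et al., 2017) applied to label-smoothed targets.

    gamma=2 down-weights easy examples; eps=0.05 softens the one-hot targets.
    """
    log_probs = F.log_softmax(logits, dim=-1)   # (B, C)
    probs = log_probs.exp()
    # Smoothed targets: 1 - eps on the true class, eps/(C-1) elsewhere.
    smooth = torch.full_like(log_probs, eps / (num_classes - 1))
    smooth.scatter_(1, targets.unsqueeze(1), 1.0 - eps)
    # Focal modulation (1 - p)^gamma shrinks the loss on confident predictions.
    loss = -(smooth * (1.0 - probs) ** gamma * log_probs).sum(dim=-1)
    return loss.mean()
```

With `gamma=0` and `eps=0` this reduces to ordinary cross-entropy, which is a convenient correctness check.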
## Engineered Features

### Original 15 Features (v0.1)
- char_count
- word_count
- avg_word_length
- uppercase_ratio
- digit_ratio
- special_char_ratio
- exclamation_count
- question_mark_count
- has_url
- url_count
- has_shortened_url
- has_phone_number
- has_email
- has_currency
- urgency_score
### New 8 Features (v0.2)
- unicode_ratio — Detects Unicode substitution (e.g., "Vérífy yøur àccount")
- char_entropy — Measures character distribution randomness (low entropy = template spam)
- suspicious_spacing — Counts spaced-out patterns like "m e s s a g e"
- leet_ratio — Detects leetspeak substitutions (e.g., "l0g1n", "@dDr355")
- max_digit_run — Longest consecutive digit sequence (useful for OTP detection)
- repeated_char_ratio — Ratio of consecutive repeated characters (e.g., "URGENT!!!")
- vocab_richness — Unique words / total words (low = template spam)
- has_obfuscated_url — Regex-based detection of broken URLs (e.g., "httpscluesjdko", spaced domains)
These new features specifically target the 283 false negatives observed in v0.1, especially short, feature-poor, or obfuscated messages.
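Two of the new signals are easy to demonstrate in isolation. The helpers below are standalone re-implementations for illustration; the `extract_features` function in the Usage Example section is the authoritative version:

```python
import math
from collections import Counter

def char_entropy(text):
    """Shannon entropy (bits) of the character distribution; low = templated."""
    n = len(text)
    if n == 0:
        return 0.0
    counts = Counter(text.lower())
    return sum(-(c / n) * math.log2(c / n) for c in counts.values())

def unicode_ratio(text):
    """Fraction of non-ASCII characters, a Unicode-substitution signal."""
    return sum(1 for c in text if ord(c) > 127) / max(len(text), 1)

print(char_entropy("aaaa"))                  # → 0.0 (one repeated character)
print(char_entropy("ab"))                    # → 1.0 (two equally likely characters)
print(unicode_ratio("Vérífy yøur àccount"))  # ≈ 0.2105 (4 of 19 chars are non-ASCII)
```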
## Usage Example

### Installation

```bash
pip install torch transformers scikit-learn joblib sentencepiece huggingface_hub
```

### Inference Code
```python
import re
import math
import json
import torch
import numpy as np
from collections import Counter
from transformers import AutoTokenizer, AutoModel
from huggingface_hub import hf_hub_download
import joblib

# --- Feature extraction functions (must match training) ---
URGENCY_WORDS = {
    "urgent", "immediately", "expires", "verify", "confirm", "suspended",
    "locked", "alert", "action required", "limited time", "click here",
    "act now", "final notice", "winner", "prize", "claim", "free",
    "blocked", "deactivated", "unusual activity"
}
URL_PATTERN = re.compile(r'(https?://|www\.)\S+|\w+\.(com|net|org|io|co|uk)', re.I)
SHORTENED_DOMAINS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co", "ow.ly", "smsg.io", "rb.gy"}
PHONE_PATTERN = re.compile(r'(\+?\d[\d\s\-().]{7,}\d)')
EMAIL_PATTERN = re.compile(r'[\w.+-]+@[\w-]+\.[a-z]{2,}', re.I)
CURRENCY_PATTERN = re.compile(r'[$£€₹¥]|(usd|gbp|eur|inr)', re.I)
LEET_MAP = str.maketrans("013457@!", "oieastai")
OBFUSCATED_URL = re.compile(
    r"(https?(?:clue|[a-z]{4,}[a-z0-9]{2,})\b)"
    r"|(?:h\s*t\s*t\s*p)"
    r"|(?:www\s*\.\s*\w)"
    r"|(?:\w+\s*\.\s*(?:com|net|org|xyz|info|co)\b)", re.I)
SPACED_WORD = re.compile(r"\b(?:\w\s){3,}\w\b")
```
```python
def extract_features(text):
    words = text.split()
    letters = [c for c in text if c.isalpha()]
    chars = list(text)
    n = len(chars)
    original = [
        len(text),                                                # char_count
        len(words),                                               # word_count
        sum(len(w) for w in words) / max(len(words), 1),          # avg_word_length
        sum(1 for c in letters if c.isupper()) / max(len(letters), 1),  # uppercase_ratio
        sum(1 for c in text if c.isdigit()) / max(len(text), 1),  # digit_ratio
        sum(1 for c in text if not c.isalnum() and not c.isspace()) / max(len(text), 1),  # special_char_ratio
        text.count('!'),                                          # exclamation_count
        text.count('?'),                                          # question_mark_count
        int(bool(URL_PATTERN.search(text))),                      # has_url
        len(URL_PATTERN.findall(text)),                           # url_count
        int(any(d in text.lower() for d in SHORTENED_DOMAINS)),   # has_shortened_url
        int(bool([m for m in PHONE_PATTERN.findall(text) if len(re.sub(r'\D', '', m)) >= 7])),  # has_phone_number
        int(bool(EMAIL_PATTERN.search(text))),                    # has_email
        int(bool(CURRENCY_PATTERN.search(text))),                 # has_currency
        sum(1 for w in URGENCY_WORDS if w in text.lower()),       # urgency_score
    ]
    counts = Counter(text.lower())
    entropy = -sum((c / n) * math.log2(c / n) for c in counts.values() if c > 0) if n > 0 else 0.0
    translated = text.translate(LEET_MAP)
    leet_changes = sum(1 for a, b in zip(text, translated) if a != b)
    max_drun, cur = 0, 0
    for c in chars:
        if c.isdigit():
            cur += 1
            max_drun = max(max_drun, cur)
        else:
            cur = 0
    repeats = sum(1 for i in range(1, n) if chars[i] == chars[i - 1]) if n > 1 else 0
    new_features = [
        sum(1 for c in chars if ord(c) > 127) / max(n, 1),        # unicode_ratio
        entropy,                                                  # char_entropy
        len(SPACED_WORD.findall(text)),                           # suspicious_spacing
        leet_changes / max(n, 1),                                 # leet_ratio
        max_drun,                                                 # max_digit_run
        repeats / max(n - 1, 1) if n > 1 else 0.0,                # repeated_char_ratio
        len(set(w.lower() for w in words)) / max(len(words), 1),  # vocab_richness
        int(bool(OBFUSCATED_URL.search(text))),                   # has_obfuscated_url
    ]
    return original + new_features
```
```python
# --- Model loading and inference ---
model_id = "notd5a/deberta-v3-malicious-sms-mms-detector"
device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(model_id)
scaler = joblib.load(hf_hub_download(model_id, "scaler.pkl"))
with open(hf_hub_download(model_id, "threshold.json")) as f:
    THRESHOLD = json.load(f)["threshold"]

class AttentionPooling(torch.nn.Module):
    def __init__(self, hidden_size):
        super().__init__()
        self.attention = torch.nn.Sequential(
            torch.nn.Linear(hidden_size, hidden_size),
            torch.nn.Tanh(),
            torch.nn.Linear(hidden_size, 1, bias=False),
        )

    def forward(self, hidden_states, attention_mask):
        scores = self.attention(hidden_states).squeeze(-1)
        scores = scores.masked_fill(attention_mask == 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1).unsqueeze(-1)
        return (hidden_states * weights).sum(dim=1)

class DeBERTaWithFeaturesV2(torch.nn.Module):
    def __init__(self, model_name, num_extra_features=23, num_labels=2, dropout=0.1):
        super().__init__()
        self.deberta = AutoModel.from_pretrained(model_name)
        H = self.deberta.config.hidden_size
        self.attn_pool = AttentionPooling(H)
        feat_dim = 128
        self.feature_proj = torch.nn.Sequential(
            torch.nn.Linear(num_extra_features, feat_dim),
            torch.nn.LayerNorm(feat_dim),
            torch.nn.GELU(),
            torch.nn.Dropout(dropout),
        )
        combined_dim = 2 * H + feat_dim
        bottleneck = 256
        self.fc1 = torch.nn.Linear(combined_dim, bottleneck)
        self.ln1 = torch.nn.LayerNorm(bottleneck)
        self.residual_block = torch.nn.Sequential(
            torch.nn.Linear(bottleneck, bottleneck),
            torch.nn.LayerNorm(bottleneck),
            torch.nn.GELU(),
            torch.nn.Dropout(dropout),
            torch.nn.Linear(bottleneck, bottleneck),
            torch.nn.LayerNorm(bottleneck),
        )
        self.dropout = torch.nn.Dropout(dropout)
        self.output_head = torch.nn.Linear(bottleneck, num_labels)

    def forward(self, input_ids, attention_mask, extra_features):
        out = self.deberta(input_ids=input_ids, attention_mask=attention_mask)
        hidden = out.last_hidden_state
        cls_emb = hidden[:, 0, :]                          # [CLS] embedding
        attn_emb = self.attn_pool(hidden, attention_mask)  # attention-pooled embedding
        feat = self.feature_proj(extra_features)
        combined = torch.cat([cls_emb, attn_emb, feat], dim=1)
        x = torch.nn.functional.gelu(self.ln1(self.fc1(combined)))
        x = x + self.residual_block(x)
        return self.output_head(self.dropout(x))

model = DeBERTaWithFeaturesV2(model_id)
state_dict = torch.load(hf_hub_download(model_id, "pytorch_model.pt"), map_location=device)
model.load_state_dict(state_dict)
model.to(device).eval()
```
```python
def predict(texts):
    if isinstance(texts, str):
        texts = [texts]
    enc = tokenizer(
        texts,
        max_length=256,
        padding="max_length",
        truncation=True,
        return_tensors="pt",
    )
    raw_feats = np.array([extract_features(t) for t in texts], dtype=np.float32)
    scaled_feats = torch.tensor(scaler.transform(raw_feats), dtype=torch.float32).to(device)
    with torch.no_grad():
        logits = model(
            enc["input_ids"].to(device),
            enc["attention_mask"].to(device),
            scaled_feats,
        )
    probs = torch.softmax(logits, dim=1)[:, 1].cpu().numpy()
    return [
        {
            "text": t,
            "prob_spam": round(float(p), 4),
            "label": int(p >= THRESHOLD),
            "prediction": "spam" if p >= THRESHOLD else "benign",
        }
        for t, p in zip(texts, probs)
    ]

# --- Example usage ---
results = predict([
    "Your account has been suspended. Verify immediately: http://bit.ly/abc123",
    "Hey, are you free for lunch tomorrow?",
    "Y ou've got mail: new messa ge w7",
])
for r in results:
    print(r)
```
## Files Included

| File | Description |
|---|---|
| `pytorch_model.pt` | Full model weights (`DeBERTaWithFeaturesV2`) |
| `tokenizer/` | Saved DeBERTa-v3-base tokenizer |
| `scaler.pkl` | `StandardScaler` for the 23 engineered features (fitted during training) |
| `threshold.json` | Optimized classification threshold (value: 0.720) |
| `config.json` | DeBERTa base configuration |
## Limitations
- Language: English-only; non-English messages may be misclassified.
- Message length: Maximum sequence length is 256 tokens; longer messages are truncated.
- Promotional boundary: Legitimate marketing messages with urgency cues (e.g., "30% OFF!") remain challenging.
- Evasion tactics: Novel obfuscation techniques not present in training data may reduce performance over time.
- No metadata: The model operates on text only — sender reputation, short codes, or carrier signals are not used.
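For the truncation limitation, one possible mitigation (not part of the released model) is to score overlapping word windows of a long message and take the maximum spam probability. A sketch with a pluggable scorer, where `score_fn` stands in for a probability-returning wrapper around `predict` and the window sizes are arbitrary choices:

```python
def windowed_max_prob(text, score_fn, window_words=180, stride=90):
    """Score overlapping word windows of a long message and return the
    highest spam probability any window receives.

    score_fn: callable mapping a text chunk to a spam probability in [0, 1].
    """
    words = text.split()
    if len(words) <= window_words:
        return score_fn(text)  # short enough: score as-is
    probs = []
    for start in range(0, len(words), stride):
        chunk = " ".join(words[start:start + window_words])
        probs.append(score_fn(chunk))
        if start + window_words >= len(words):
            break  # this window already reached the end of the message
    return max(probs)
```

Max-pooling over windows is conservative: a single spammy window flags the whole message, which matches the high-recall posture of the default threshold.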
## License

CC BY-NC 4.0. Free for research and non-commercial use; commercial use requires explicit permission.

- Contact: ahmadabushawar21@gmail.com
- Model version: v0.2.2
- Release date: 2026
- Training time: 166 minutes on 4×RTX 3090
## Model Tree

Base model: microsoft/deberta-v3-base