mmBERT-32K Feedback Detector (LoRA)

A 4-class user feedback classifier fine-tuned from mmbert-32k-yarn using LoRA (Low-Rank Adaptation).

Model Description

This model classifies user messages into 4 feedback categories to help conversational AI systems understand user satisfaction and respond appropriately:

| Label | ID | Description |
|---|---|---|
| SAT | 0 | User is satisfied with the response |
| NEED_CLARIFICATION | 1 | User needs more explanation or details |
| WRONG_ANSWER | 2 | User indicates the response was incorrect |
| WANT_DIFFERENT | 3 | User wants an alternative approach/answer |
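
The same mapping can be kept as a small Python dict when wiring the classifier into an application (a convenience sketch that mirrors the table above; it is not read from the adapter config):

# Label mapping used throughout the examples below
ID2LABEL = {
    0: "SAT",
    1: "NEED_CLARIFICATION",
    2: "WRONG_ANSWER",
    3: "WANT_DIFFERENT",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}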

Performance

Validation Results (2,985 samples):

| Metric | Value |
|---|---|
| Accuracy | 98.83% |
| F1 (macro) | 98.24% |
| F1 (weighted) | 98.83% |

Per-Class Performance:

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| SAT | 1.0000 | 1.0000 | 1.0000 | 1,491 |
| NEED_CLARIFICATION | 0.9980 | 0.9980 | 0.9980 | 498 |
| WRONG_ANSWER | 0.9604 | 0.9739 | 0.9671 | 498 |
| WANT_DIFFERENT | 0.9715 | 0.9578 | 0.9646 | 498 |

Usage

With PEFT (Recommended)

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert-32k-yarn",
    num_labels=4
)
tokenizer = AutoTokenizer.from_pretrained("llm-semantic-router/mmbert-32k-yarn")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "llm-semantic-router/mmbert32k-feedback-detector-lora")
model.eval()

# Inference
labels = ["SAT", "NEED_CLARIFICATION", "WRONG_ANSWER", "WANT_DIFFERENT"]
text = "I don't understand your explanation, can you clarify?"

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
prediction = outputs.logits.argmax(-1).item()

print(f"Feedback: {labels[prediction]}")  # Output: NEED_CLARIFICATION

Using Merged Model (No PEFT required)

For easier deployment, use the merged version:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)
tokenizer = AutoTokenizer.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)
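
If the merged checkpoint exposes its id2label mapping in config.json, the higher-level pipeline API also works (a sketch; if the config only carries generic LABEL_0-style names, map them back to the table above):

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="llm-semantic-router/mmbert32k-feedback-detector-merged",
)
print(classifier("Can you explain that in more detail?"))
# e.g. [{'label': 'NEED_CLARIFICATION', 'score': 0.99}]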

Training Details

Hyperparameters

| Parameter | Value |
|---|---|
| Base Model | llm-semantic-router/mmbert-32k-yarn |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| LoRA Dropout | 0.1 |
| Target Modules | attn.Wqkv, attn.Wo, mlp.Wi, mlp.Wo |
| Learning Rate | 2e-5 |
| Batch Size | 16 |
| Epochs | 10 (early stopping at ~5.4) |
| Warmup Ratio | 0.1 |
| Weight Decay | 0.01 |
| Precision | bf16 |
| Optimizer | AdamW |
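
The LoRA settings above translate into a peft LoraConfig roughly as follows (a reconstruction from the listed hyperparameters, not the original training script):

from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # sequence classification head
    r=64,                        # LoRA rank
    lora_alpha=128,
    lora_dropout=0.1,
    target_modules=["attn.Wqkv", "attn.Wo", "mlp.Wi", "mlp.Wo"],
)

# Applied to the base model with: get_peft_model(base_model, lora_config)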

Training Data

Trained on llm-semantic-router/feedback-detector-dataset:

  • Training samples: 17,896 (balanced across 4 classes)
  • Validation samples: 2,985
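
The dataset can be pulled from the Hub for reproduction or error analysis (a minimal sketch; the train/validation split names are assumed):

from datasets import load_dataset

ds = load_dataset("llm-semantic-router/feedback-detector-dataset")
print(ds)  # expect roughly 17,896 training and 2,985 validation examples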

Hardware

  • GPU: AMD Instinct MI300X (192GB HBM3)
  • Training Time: ~10 minutes
  • Framework: PyTorch 2.x with ROCm

Multilingual Support

The model inherits multilingual capabilities from mmbert-32k-yarn (Glot500 tokenizer covering 1,800+ languages). Performance is strongest in the following languages (a short example follows the list):

  • English (primary)
  • Chinese (Simplified/Traditional)
  • French
  • Spanish

Limitations

  • The SAT class has the strongest performance; some edge cases between WRONG_ANSWER and WANT_DIFFERENT may be ambiguous
  • Phrases like "That's perfect, no more questions" may sometimes be misclassified
  • Best suited for conversational AI feedback detection, not general sentiment analysis

Citation

@misc{mmbert32k-feedback-detector,
  title={mmBERT-32K Feedback Detector},
  author={LLM Semantic Router Team},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/llm-semantic-router/mmbert32k-feedback-detector-lora}
}

License

Apache 2.0

Framework Versions

  • PEFT: 0.18.1
  • Transformers: 4.48+
  • PyTorch: 2.6+