mmBERT-32K Feedback Detector (LoRA)

A 4-class user feedback classifier fine-tuned from mmbert-32k-yarn using LoRA (Low-Rank Adaptation).

Model Description

This model classifies user messages into 4 feedback categories to help conversational AI systems understand user satisfaction and respond appropriately:

| Label | ID | Description |
|---|---|---|
| SAT | 0 | User is satisfied with the response |
| NEED_CLARIFICATION | 1 | User needs more explanation or details |
| WRONG_ANSWER | 2 | User indicates the response was incorrect |
| WANT_DIFFERENT | 3 | User wants an alternative approach/answer |
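
The same mapping can be kept as a small Python dict when wiring the classifier into an application (a convenience sketch that mirrors the table above; it is not read from the adapter config):

# Label mapping used throughout the examples below
ID2LABEL = {
    0: "SAT",
    1: "NEED_CLARIFICATION",
    2: "WRONG_ANSWER",
    3: "WANT_DIFFERENT",
}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}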

Performance

Validation Results (2,985 samples):

| Metric | Value |
|---|---|
| Accuracy | 98.83% |
| F1 (macro) | 98.24% |
| F1 (weighted) | 98.83% |

Per-Class Performance:

| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| SAT | 1.0000 | 1.0000 | 1.0000 | 1,491 |
| NEED_CLARIFICATION | 0.9980 | 0.9980 | 0.9980 | 498 |
| WRONG_ANSWER | 0.9604 | 0.9739 | 0.9671 | 498 |
| WANT_DIFFERENT | 0.9715 | 0.9578 | 0.9646 | 498 |

Usage

With PEFT (Recommended)

import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from peft import PeftModel

# Load base model
base_model = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert-32k-yarn",
    num_labels=4
)
tokenizer = AutoTokenizer.from_pretrained("llm-semantic-router/mmbert-32k-yarn")

# Load LoRA adapter
model = PeftModel.from_pretrained(base_model, "llm-semantic-router/mmbert32k-feedback-detector-lora")
model.eval()

# Inference
labels = ["SAT", "NEED_CLARIFICATION", "WRONG_ANSWER", "WANT_DIFFERENT"]
text = "I don't understand your explanation, can you clarify?"

inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Run inference without tracking gradients
with torch.no_grad():
    outputs = model(**inputs)
prediction = outputs.logits.argmax(-1).item()

print(f"Feedback: {labels[prediction]}")  # Output: NEED_CLARIFICATION

Using Merged Model (No PEFT required)

For easier deployment, use the merged version:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model = AutoModelForSequenceClassification.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)
tokenizer = AutoTokenizer.from_pretrained(
    "llm-semantic-router/mmbert32k-feedback-detector-merged"
)
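
If the merged checkpoint exposes its id2label mapping in config.json, the higher-level pipeline API also works (a sketch; if the config only carries generic LABEL_0-style names, map them back to the table above):

from transformers import pipeline

classifier = pipeline(
    "text-classification",
    model="llm-semantic-router/mmbert32k-feedback-detector-merged",
)
print(classifier("Can you explain that in more detail?"))
# e.g. [{'label': 'NEED_CLARIFICATION', 'score': 0.99}]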

Training Details

Hyperparameters

| Parameter | Value |
|---|---|
| Base Model | llm-semantic-router/mmbert-32k-yarn |
| LoRA Rank | 64 |
| LoRA Alpha | 128 |
| LoRA Dropout | 0.1 |
| Target Modules | attn.Wqkv, attn.Wo, mlp.Wi, mlp.Wo |
| Learning Rate | 2e-5 |
| Batch Size | 16 |
| Epochs | 10 (early stopping at ~5.4) |
| Warmup Ratio | 0.1 |
| Weight Decay | 0.01 |
| Precision | bf16 |
| Optimizer | AdamW |
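
The LoRA settings above translate into a peft LoraConfig roughly as follows (a reconstruction from the listed hyperparameters, not the original training script):

from peft import LoraConfig, TaskType

lora_config = LoraConfig(
    task_type=TaskType.SEQ_CLS,  # sequence classification head
    r=64,                        # LoRA rank
    lora_alpha=128,
    lora_dropout=0.1,
    target_modules=["attn.Wqkv", "attn.Wo", "mlp.Wi", "mlp.Wo"],
)

# Applied to the base model with: get_peft_model(base_model, lora_config)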

Training Data

Trained on llm-semantic-router/feedback-detector-dataset:

  • Training samples: 17,896 (balanced across 4 classes)
  • Validation samples: 2,985
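
The dataset can be pulled from the Hub for reproduction or error analysis (a minimal sketch; the train/validation split names are assumed):

from datasets import load_dataset

ds = load_dataset("llm-semantic-router/feedback-detector-dataset")
print(ds)  # expect roughly 17,896 training and 2,985 validation examples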

Hardware

  • GPU: AMD Instinct MI300X (192GB HBM3)
  • Training Time: ~10 minutes
  • Framework: PyTorch 2.x with ROCm

Multilingual Support

The model inherits multilingual capabilities from mmbert-32k-yarn (Glot500 tokenizer covering 1,800+ languages). Performance is strongest in the following languages (a short example follows the list):

  • English (primary)
  • Chinese (Simplified/Traditional)
  • French
  • Spanish

Limitations

  • The SAT class has the strongest performance; some edge cases between WRONG_ANSWER and WANT_DIFFERENT may be ambiguous
  • Phrases like "That's perfect, no more questions" may sometimes be misclassified
  • Best suited for conversational AI feedback detection, not general sentiment analysis

Citation

@misc{mmbert32k-feedback-detector,
  title={mmBERT-32K Feedback Detector},
  author={LLM Semantic Router Team},
  year={2026},
  publisher={Hugging Face},
  url={https://huggingface.co/llm-semantic-router/mmbert32k-feedback-detector-lora}
}

License

Apache 2.0

Framework Versions

  • PEFT: 0.18.1
  • Transformers: 4.48+
  • PyTorch: 2.6+