# Sentence Selection ORPO LoRA
Fine-tuned LoRA adapter for sentence selection in debate contexts.
## Training Details
- Base model: Qwen3-30B-A3B (ORPO applied on top of an SFT fine-tuned checkpoint)
- Method: ORPO (Odds Ratio Preference Optimization)
- Task: select the IDs of sentences in an academic text that support a given claim
- F1 score: 0.247 on the holdout set (see the evaluation sketch below)
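For reference, the sketch below shows one way the reported F1 could be computed, assuming it is the set-level F1 between predicted and gold sentence IDs; the actual evaluation script is not reproduced here.

```python
def sentence_selection_f1(predicted_ids, gold_ids):
    """Set-level F1 between predicted and gold sentence IDs (assumed metric)."""
    predicted, gold = set(predicted_ids), set(gold_ids)
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)

# Example: 2 of 3 predictions are correct, covering 2 of 4 gold sentences -> F1 ≈ 0.571
print(sentence_selection_f1([3, 7, 12], [3, 7, 15, 20]))
```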
## Usage
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM

# Load the base model, then attach the LoRA adapter on top of it
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3-30B-A3B")
model = PeftModel.from_pretrained(base_model, "debaterhub/sentence-selection-orpo-lora")
```
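After loading, inference follows the standard `transformers` generate flow. The sketch below is illustrative only: the numbered-sentence prompt format and the sample inputs are assumptions, since the exact prompt template used during training is not documented here.

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-30B-A3B")

# Hypothetical prompt: sentences are numbered and the model is asked to
# return the IDs that support the claim (format is an assumption).
sentences = ["Sentence one ...", "Sentence two ...", "Sentence three ..."]
claim = "Example claim to support."
prompt = (
    "Claim: " + claim + "\n"
    + "\n".join(f"[{i}] {s}" for i, s in enumerate(sentences))
    + "\nReturn the IDs of the sentences that support the claim."
)

messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```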
## Key Findings
- Format-consistent DPO-style preference pairs (chosen and rejected responses in the same output format) are essential for ORPO training; see the sketch after this list
- 2 epochs is optimal; more epochs cause overfitting
- Noise augmentation corrects the positional bias in the training data (also shown in the sketch below)
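The sketch below illustrates what format-consistent pair construction with noise augmentation could look like. The helper name, JSON output format, and shuffling strategy are assumptions for illustration, not the project's actual data pipeline.

```python
import json
import random

def build_orpo_pair(sentences, gold_ids, distractor_ids, seed=0):
    """Build one format-consistent preference pair (hypothetical data layout).

    Chosen and rejected completions share the same JSON shape, so the
    preference signal reflects which sentences were selected rather than
    how the answer is formatted.
    """
    rng = random.Random(seed)

    # Noise augmentation: shuffle the sentence order so that gold sentences
    # do not always occupy the same positions (counters positional bias).
    order = list(range(len(sentences)))
    rng.shuffle(order)
    new_id = {old: new for new, old in enumerate(order)}

    prompt = "\n".join(f"[{i}] {sentences[old]}" for i, old in enumerate(order))
    chosen = json.dumps({"selected_ids": sorted(new_id[i] for i in gold_ids)})
    rejected = json.dumps({"selected_ids": sorted(new_id[i] for i in distractor_ids)})
    return {"prompt": prompt, "chosen": chosen, "rejected": rejected}

# Example: gold evidence is sentences 0 and 2; the rejected answer picks 1 and 3.
pair = build_orpo_pair(
    ["Claim-relevant finding.", "Background detail.", "Supporting statistic.", "Unrelated aside."],
    gold_ids=[0, 2],
    distractor_ids=[1, 3],
)
print(pair["chosen"], pair["rejected"])
```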