# MCPDial Best Checkpoints

This repository contains the best-performing checkpoints from hyperparameter search experiments.

## Models
### Llama (`llama_lr_3e5_r48_a16`)

- BLEU: 0.0586
- ROUGE-1: 0.2939
- ROUGE-L: 0.2163
- Distinct-1/2: 0.2199 / 0.6315
- Average Length: 34.6 tokens
- Hyperparameters:
  - Learning Rate: 3e-5
  - LoRA Rank: 48
  - LoRA Alpha: 16
### Qwen (`qwen_lr_1e5_r128_a32`)

- BLEU: 0.0182
- ROUGE-1: 0.1577
- ROUGE-L: 0.1074
- Distinct-1/2: 0.0988 / 0.3736
- Average Length: 176.8 tokens
- Hyperparameters:
  - Learning Rate: 1e-5
  - LoRA Rank: 128
  - LoRA Alpha: 32
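For reference, the rank and alpha values above map onto PEFT's `LoraConfig`. Below is a minimal sketch for the Llama run; the `target_modules` and `lora_dropout` values are illustrative assumptions, not settings taken from the actual training runs (the learning rate goes to the optimizer, not the LoRA config).

```python
from peft import LoraConfig

# Sketch of the Llama run's LoRA settings (llama_lr_3e5_r48_a16).
# target_modules and lora_dropout are assumed for illustration.
llama_lora_config = LoraConfig(
    r=48,                                 # LoRA rank from the checkpoint name
    lora_alpha=16,                        # LoRA alpha from the checkpoint name
    target_modules=["q_proj", "v_proj"],  # assumed attention projections
    lora_dropout=0.05,                    # assumed value
    task_type="CAUSAL_LM",
)
```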
## Usage

Load the adapters using PEFT:
```python
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# For Llama
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
model = PeftModel.from_pretrained(base_model, "YoungerWu/11667-mcpdial-best-checkpoints", subfolder="llama_lr_3e5_r48_a16")

# For Qwen
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base_model, "YoungerWu/11667-mcpdial-best-checkpoints", subfolder="qwen_lr_1e5_r128_a32")
```
## Project

Part of a CMU 11-667 final project on multi-turn dialogue generation.