
MCPDial Best Checkpoints

This repository contains the best-performing checkpoints from our hyperparameter search experiments.

Models

Llama (llama_lr_3e5_r48_a16)

  • BLEU: 0.0586
  • ROUGE-1: 0.2939
  • ROUGE-L: 0.2163
  • Distinct-1/2: 0.2199 / 0.6315
  • Average Length: 34.6 tokens
  • Hyperparameters:
    • Learning Rate: 3e-5
    • LoRA Rank: 48
    • LoRA Alpha: 16

Qwen (qwen_lr_1e5_r128_a32)

  • BLEU: 0.0182
  • ROUGE-1: 0.1577
  • ROUGE-L: 0.1074
  • Distinct-1/2: 0.0988 / 0.3736
  • Average Length: 176.8 tokens
  • Hyperparameters:
    • Learning Rate: 1e-5
    • LoRA Rank: 128
    • LoRA Alpha: 32
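
The Distinct-1/2 scores above are conventionally computed as the ratio of unique n-grams to total n-grams across all generated responses. A minimal sketch of that computation (assuming whitespace tokenization; the exact tokenizer used for scoring may differ):

```python
def distinct_n(texts, n):
    """Ratio of unique n-grams to total n-grams across all texts."""
    ngrams = []
    for text in texts:
        tokens = text.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

outputs = ["the tool call succeeded", "the tool call failed"]
print(distinct_n(outputs, 1))  # 5 unique / 8 total = 0.625
print(distinct_n(outputs, 2))  # 4 unique / 6 total ≈ 0.667
```

Higher Distinct-1/2 indicates less repetitive output, which is consistent with the Llama checkpoint's shorter, more varied responses compared to the Qwen checkpoint.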

Usage

Load the adapters using PEFT:

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# For Llama
base_model = AutoModelForCausalLM.from_pretrained("meta-llama/Llama-3.2-3B-Instruct")
model = PeftModel.from_pretrained(base_model, "YoungerWu/11667-mcpdial-best-checkpoints", subfolder="llama_lr_3e5_r48_a16")

# For Qwen
base_model = AutoModelForCausalLM.from_pretrained("Qwen/Qwen2.5-7B-Instruct")
model = PeftModel.from_pretrained(base_model, "YoungerWu/11667-mcpdial-best-checkpoints", subfolder="qwen_lr_1e5_r128_a32")

Project

Part of the CMU 11-667 final project on multi-turn dialogue generation.
