SpecExit ready-for-benchmarking 'draft model'

#4 · opened by Dhia-GB

Hi,
Thank you for the great work on SpecExit! I'm trying to benchmark the method using Qwen3-8B (and Qwen3-14B) as the base model.

Issue

I attempted to use a standard EAGLE3 draft model (AngelSlim/Qwen3-4B_eagle3) with the provided inference code, but encountered a weight shape mismatch:

```
RuntimeError: Error(s) in loading state_dict for Model: size mismatch for fc.weight: copying a param with shape torch.Size([2560, 7680]) from checkpoint, the shape in current model is torch.Size([2563, 7680]).
```

This is expected: standard EAGLE3 checkpoints don't include the three extra fc outputs (2560 → 2563) required by the CPR (Confidence-Progress-Remain) prediction head in SpecExit.
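
As a stopgap, the checkpoint can be padded with three zero-initialized rows so it at least loads into the SpecExit architecture. This is a minimal sketch, assuming the checkpoint ships as a single model.safetensors file and stores the layer under the fc.weight / fc.bias keys the loader reports; both are guesses on my part:

```python
# Stopgap: pad the EAGLE3 fc layer with 3 untrained rows so the checkpoint
# loads into the SpecExit architecture. The new rows feed the CPR head and
# carry no signal until trained, so this is good for smoke tests only.
import torch
from huggingface_hub import hf_hub_download
from safetensors.torch import load_file, save_file

# Assumption: single-file safetensors checkpoint; adjust the filename if the
# repo ships pytorch_model.bin instead.
path = hf_hub_download("AngelSlim/Qwen3-4B_eagle3", "model.safetensors")
state = load_file(path)

w = state["fc.weight"]                           # [2560, 7680] per the error above
pad = torch.zeros(3, w.shape[1], dtype=w.dtype)
state["fc.weight"] = torch.cat([w, pad], dim=0)  # -> [2563, 7680]

if "fc.bias" in state:                           # pad the bias too, if present
    b = state["fc.bias"]
    state["fc.bias"] = torch.cat([b, torch.zeros(3, dtype=b.dtype)], dim=0)

save_file(state, "qwen3-4b_eagle3_cpr_padded.safetensors")
```

Even with this, the CPR signals are meaningless until the head is trained, which is why released weights (or the training recipe) would be the real fix.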

Request

Would it be possible to release pre-trained SpecExit draft models that include the CPR head? Specifically:

  1. Qwen3-based models (e.g., Qwen3-8B + compatible draft model)
  2. Any models used in the paper's benchmarks (for reproduction purposes)

Alternatively, if releasing full models isn't feasible, could you share details of your draft-model training setup (hyperparameters, training dataset, duration, and any useful tips)?
In particular, sharing the sharegpt_train.json and sharegpt_test.json files would be very helpful, since the codebase includes no support for building the training dataset.
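
For reference, this is how I would try to reconstruct the split myself. It is only a sketch: the ShareGPT schema, source file, seed, and 95/5 ratio are all assumptions, and the paper's actual preprocessing may differ:

```python
# Hypothetical reconstruction of the dataset split. Assumes the standard
# ShareGPT schema (a JSON list of {"id": ..., "conversations": [...]} records);
# source filename and split ratio are guesses.
import json
import random

with open("ShareGPT_V3_unfiltered_cleaned_split.json") as f:  # assumed source dump
    data = json.load(f)

random.seed(0)               # fixed seed for a reproducible split
random.shuffle(data)
cut = int(0.95 * len(data))  # assumed 95/5 train/test ratio

with open("sharegpt_train.json", "w") as f:
    json.dump(data[:cut], f)
with open("sharegpt_test.json", "w") as f:
    json.dump(data[cut:], f)
```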

Failure details

  • Codebase used: https://anonymous.4open.science/r/SpecExit-B802
  • Running inference with gen_ea_answer.py
  • Benchmark: GSM8K
  • Base model: Qwen/Qwen3-8B
  • Attempted draft model: AngelSlim/Qwen3-4B_eagle3
