Request for ready-for-benchmarking SpecExit draft models
Hi,
Thank you for the great work on SpecExit! I'm trying to benchmark the method using Qwen3-8B (and Qwen3-14B) as the base model.
Issue
I attempted to use a standard EAGLE3 draft model (AngelSlim/Qwen3-4B_eagle3) with the provided inference code, but encountered a weight shape mismatch:
RuntimeError: Error(s) in loading state_dict for Model: size mismatch for fc.weight: copying a param with shape torch.Size([2560, 7680]) from checkpoint, the shape in current model is torch.Size([2563, 7680]).
This is expected since standard EAGLE3 models don't include the +3 outputs for the CPR (Confidence-Progress-Remain) prediction head required by SpecExit.
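For reference, here is my reading of the mismatch as a minimal sketch. The dimensions are inferred purely from the error message (Qwen3-4B hidden size 2560, EAGLE3's fused input of 3 concatenated hidden states = 7680), not from the SpecExit code itself:

```python
import torch

hidden_size = 2560           # Qwen3-4B draft hidden size, per the error message
fused_dim = 3 * hidden_size  # EAGLE3 concatenates three hidden states -> 7680

# Standard EAGLE3 draft head: projects fused features back to hidden_size.
eagle3_fc = torch.nn.Linear(fused_dim, hidden_size, bias=False)
print(eagle3_fc.weight.shape)  # torch.Size([2560, 7680]) -- matches the checkpoint

# Assumed SpecExit variant: +3 output channels for the
# Confidence/Progress/Remain (CPR) signals.
specexit_fc = torch.nn.Linear(fused_dim, hidden_size + 3, bias=False)
print(specexit_fc.weight.shape)  # torch.Size([2563, 7680]) -- what the code expects
```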
Request
Would it be possible to release pre-trained SpecExit draft models that include the CPR head? Specifically:
- Qwen3-based models (e.g., Qwen3-8B + compatible draft model)
- Any models used in the paper's benchmarks (for reproduction purposes)
Alternatively, if releasing full models isn't feasible, could you share details of your draft-model training setup (hyperparameters, training dataset, training duration, and any useful tips)?
In particular, sharing the sharegpt_train.json and sharegpt_test.json files would be very helpful, since the codebase includes no support for creating the training dataset.
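In the meantime, the only stopgap I can think of is padding a standard EAGLE3 checkpoint's fc.weight with three zero rows so it loads. A sketch is below (file names are illustrative, and the key name fc.weight is taken from the error message). Note the padded CPR rows would be untrained, so this only unblocks loading and cannot reproduce the paper's results:

```python
import torch

# Workaround sketch (NOT the authors' method): pad an EAGLE3 checkpoint so it
# loads into the SpecExit model. The CPR head stays untrained/random.
ckpt = torch.load("pytorch_model.bin", map_location="cpu")  # illustrative path

fc_weight = ckpt["fc.weight"]                # [2560, 7680] in the EAGLE3 checkpoint
pad = torch.zeros(3, fc_weight.shape[1],     # 3 extra rows for the CPR outputs
                  dtype=fc_weight.dtype)
ckpt["fc.weight"] = torch.cat([fc_weight, pad], dim=0)  # -> [2563, 7680]

torch.save(ckpt, "pytorch_model_padded.bin")
```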
Failure details
- Codebase used: https://anonymous.4open.science/r/SpecExit-B802
- Running inference with gen_ea_answer.py
- Benchmark: GSM8K
- Base model: Qwen/Qwen3-8B
- Attempted draft model: AngelSlim/Qwen3-4B_eagle3