W4A16 quant

#1
by timroethig - opened

Thanks for providing these quants. Are you by chance also working on a W4A16 quant? It could make a lot of sense for a sparse MoE model like this, no?

Red Hat AI org

Yes, other quant schemes (INT4 and FP4) are coming very soon.

Thanks for your work.
Hopefully I can run it on two H100s :)

Maybe you need GGUF Q3 or AutoRound W2A16 🥲

> yes, other quant schemes (int4 and fp4) are coming very soon

I'm very much looking forward to your INT4 (W4A16) quantized model!

Or do you have a quantization Python script for Qwen3.5-397B that we can try ourselves?
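While waiting for an official script (the org's released quants are, as far as I know, typically produced with llm-compressor), here is a minimal NumPy sketch of what a W4A16 scheme does, just to illustrate the idea. The group size of 128 and symmetric scaling are my assumptions, not the org's recipe: weights are rounded to a per-group int4 grid with an fp16 scale, while activations stay in 16-bit floats and the matmul runs against dequantized weights.

```python
import numpy as np

def quantize_w4a16(w_flat, group_size=128):
    """Symmetric per-group int4 weight quantization (W4A16 sketch).

    Returns int4-range codes (stored in int8) and per-group fp16 scales.
    group_size=128 is an assumed hyperparameter, not a known recipe.
    """
    groups = w_flat.reshape(-1, group_size)
    # Map the largest magnitude in each group to +/-7 (symmetric int4 grid).
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales.astype(np.float16)

def dequantize(q, scales):
    # Reconstruct approximate fp32 weights from codes and scales.
    return q.astype(np.float32) * scales.astype(np.float32)

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 128)).astype(np.float32)      # toy weight matrix
q, s = quantize_w4a16(w.ravel(), group_size=128)
w_hat = dequantize(q, s).reshape(w.shape)

x = rng.standard_normal((4, 256)).astype(np.float16)        # activations stay 16-bit ("A16")
y = x.astype(np.float32) @ w_hat                            # matmul against dequantized weights
err = np.abs(w - w_hat).max()
print(f"max weight reconstruction error: {err:.4f}")
```

The win for a sparse MoE is that expert weights dominate memory but only a few experts are active per token, so shrinking weights to 4 bits cuts the footprint roughly 4x while activations keep 16-bit precision.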

Any progress?
