W4A16 quant

#1
by timroethig - opened

Thanks for providing these quants. Are you by chance also working on a W4A16 quant? It could make a lot of sense for a sparse MoE model like this, no?

Red Hat AI org

Yes, other quant schemes (INT4 and FP4) are coming very soon.

Thanks for your work.
Hopefully I can run it on two H100s :)

Maybe you need GGUF Q3 or AutoRound W2A16 🥲

> yes, other quant schemes (int4 and fp4) are coming very soon

I'm very much looking forward to your INT4 (W4A16) quantized model!

Or do you have a quantization Python script for Qwen3.5-397B that we can try ourselves?
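While waiting for an official script (the org's released quants are, as far as I know, typically produced with llm-compressor), here is a minimal NumPy sketch of what a W4A16 scheme does, just to illustrate the idea. The group size of 128 and symmetric scaling are my assumptions, not the org's recipe: weights are rounded to a per-group int4 grid with an fp16 scale, while activations stay in 16-bit floats and the matmul runs against dequantized weights.

```python
import numpy as np

def quantize_w4a16(w_flat, group_size=128):
    """Symmetric per-group int4 weight quantization (W4A16 sketch).

    Returns int4-range codes (stored in int8) and per-group fp16 scales.
    group_size=128 is an assumed hyperparameter, not a known recipe.
    """
    groups = w_flat.reshape(-1, group_size)
    # Map the largest magnitude in each group to +/-7 (symmetric int4 grid).
    scales = np.abs(groups).max(axis=1, keepdims=True) / 7.0
    q = np.clip(np.round(groups / scales), -8, 7).astype(np.int8)
    return q, scales.astype(np.float16)

def dequantize(q, scales):
    # Reconstruct approximate fp32 weights from codes and scales.
    return q.astype(np.float32) * scales.astype(np.float32)

rng = np.random.default_rng(0)
w = rng.standard_normal((256, 128)).astype(np.float32)      # toy weight matrix
q, s = quantize_w4a16(w.ravel(), group_size=128)
w_hat = dequantize(q, s).reshape(w.shape)

x = rng.standard_normal((4, 256)).astype(np.float16)        # activations stay 16-bit ("A16")
y = x.astype(np.float32) @ w_hat                            # matmul against dequantized weights
err = np.abs(w - w_hat).max()
print(f"max weight reconstruction error: {err:.4f}")
```

The win for a sparse MoE is that expert weights dominate memory but only a few experts are active per token, so shrinking weights to 4 bits cuts the footprint roughly 4x while activations keep 16-bit precision.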

Any progress?
