W4A16 quant
#1
by timroethig - opened
Thanks for providing these quants. Are you by chance also working on a W4A16 quant? It could make a lot of sense for a sparse MoE model like this, no?
Yes, other quant schemes (INT4 and FP4) are coming very soon.
Thanks for your work.
Hopefully I can run it on two H100s :)
> Thanks for your work.
> Hopefully I can run it on two H100s :)
Maybe you need GGUF Q3 or AutoRound W2A16 🥲
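For anyone wondering why 4-bit might not be enough for two H100s, here is a rough back-of-the-envelope sketch. It assumes about 397B parameters (read off the model name Qwen3.5-397B mentioned in this thread) and 80 GB per H100; both numbers are assumptions, and the function name `weight_memory_gb` is made up for illustration. It counts weights only, so scales, KV cache, and activations would come on top, and real formats such as GGUF Q3 use somewhat more than their nominal bit width.

```python
def weight_memory_gb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in GB (1 GB = 1e9 bytes).

    Ignores quantization scales, KV cache, and activation memory.
    """
    return num_params * bits_per_weight / 8 / 1e9


NUM_PARAMS = 397e9   # assumption: taken from the "397B" in the model name
GPU_MEMORY_GB = 80   # assumption: 80 GB H100 variant
NUM_GPUS = 2

budget_gb = GPU_MEMORY_GB * NUM_GPUS

for bits in (16, 8, 4, 3, 2):
    need_gb = weight_memory_gb(NUM_PARAMS, bits)
    verdict = "fits" if need_gb < budget_gb else "does not fit"
    print(f"{bits:>2}-bit weights: ~{need_gb:.0f} GB -> {verdict} in {budget_gb} GB")
```

Under these assumptions, 4-bit weights alone already exceed the combined memory of two cards, which is why something closer to 3 or 2 bits per weight gets suggested.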
> Yes, other quant schemes (INT4 and FP4) are coming very soon.
I'm very much looking forward to your INT4 (W4A16) quantized model!
Or do you have a Python quantization script for Qwen3.5-397B that we can try ourselves?
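This is not the authors' script, just a toy sketch of what the "W4" part of W4A16 does, for readers new to the term: weights are stored as 4-bit integers with one scale per group, while activations stay in 16-bit. It uses plain Python lists and symmetric round-to-nearest with a group size of 128; the group size, the rounding scheme, and all function names here are illustrative assumptions, and real tools use calibration-based methods and packed tensors instead.

```python
import random


def quantize_group(values):
    """Symmetric INT4 quantization of one group: (ints in [-8, 7], scale)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 7.0 if max_abs > 0 else 1.0
    q = [max(-8, min(7, round(v / scale))) for v in values]
    return q, scale


def quantize_row(row, group_size=128):
    """Split one weight row into groups and quantize each group separately."""
    groups = [row[i:i + group_size] for i in range(0, len(row), group_size)]
    return [quantize_group(g) for g in groups]


def dequantize_row(quantized):
    """Rebuild approximate weights from the 4-bit integers and group scales."""
    return [q * scale for qs, scale in quantized for q in qs]


random.seed(0)
weights = [random.gauss(0.0, 1.0) for _ in range(512)]
# Activations are left untouched: only the weights are quantized.
activations = [random.gauss(0.0, 1.0) for _ in range(512)]

quantized = quantize_row(weights)
restored = dequantize_row(quantized)

exact = sum(w * a for w, a in zip(weights, activations))
approx = sum(w * a for w, a in zip(restored, activations))
max_err = max(abs(w - r) for w, r in zip(weights, restored))

print(f"dot product: exact={exact:.4f}, with 4-bit weights={approx:.4f}")
print(f"largest per-weight error: {max_err:.4f}")
```

Each weight ends up within half a quantization step of its original value, so smaller groups (and thus smaller scales) trade extra scale storage for lower error.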
Any progress?