Active filters: int4
2imi9/Qwen3-1.7b-gptq-int4 • Text Generation • 0.9B • 2
Dhruvil03/Perception-LM-1B-Int4bit • Image-Text-to-Text • 2B • 7 • 1
RiverkanIT/Ling-mini-2.0-Quantized • Text Generation • 2
ForeseeLab/foreseeai-qwen3-4b-iot-int4 • Text Generation • 4B • 1
ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-nvfp • Image-Text-to-Text • 5B • 62
ISTA-DASLab/Llama-3.1-8B-Instruct-MR-GPTQ-mxfp • Image-Text-to-Text • 5B • 12
huawei-csl/Qwen3-1.7B-4bit-SINQ • Text Generation • 1B • 5 • 5
huawei-csl/Qwen3-1.7B-4bit-ASINQ • Text Generation • 1B • 3 • 5
huawei-csl/Qwen3-32B-4bit-SINQ • Text Generation • 18B • 11 • 7
huawei-csl/Qwen3-14B-4bit-SINQ • Text Generation • 9B • 7 • 5
huawei-csl/Qwen3-14B-4bit-ASINQ • Text Generation • 9B • 3 • 6
huawei-csl/Qwen3-32B-4bit-ASINQ • Text Generation • 18B • 4 • 8
ModelCloud/GLM-4.6-GPTQMODEL-W4A16-v1 • Text Generation • 357B • 1
ModelCloud/GLM-4.6-GPTQMODEL-W4A16-v2 • Text Generation • 357B • 3 • 1
PangaiaSoftware/YanoljaNEXT-Rosetta-4B-onnx • Translation • 1 • 2
RedHatAI/NVIDIA-Nemotron-Nano-9B-v2-quantized.w4a16 • Text Generation • 2B • 245 • 5
ModelCloud/GLM-4.6-REAP-268B-A32B-GPTQMODEL-W4A16 • Text Generation • 269B • 3 • 2
AhtnaGlen/phi-4-mini-instruct-int4-sym-npu-ov • Text Generation • 3
tencent/DeepSeek-V3.1-Terminus-W4AFP8 • Text Generation • 349B • 1.66k • 15
ModelCloud/MiniMax-M2-GPTQMODEL-W4A16 • Text Generation • 229B • 59 • 3
ModelCloud/Marin-32B-Base-GPTQMODEL-W4A16 • Text Generation • 33B • 1 • 1
ModelCloud/Marin-32B-Base-GPTQMODEL-AWQ-W4A16 • Text Generation • 33B • 10 • 2
huawei-csl/Apertus-8B-2509-4bit-SINQ • Text Generation • 5B • 2
huawei-csl/Apertus-8B-2509-4bit-ASINQ • Text Generation • 5B • 5 • 2
ModelCloud/Granite-4.0-H-1B-GPTQMODEL-W4A16 • Text Generation • 1B • 10 • 1
ModelCloud/Granite-4.0-H-350M-GPTQMODEL-W4A16 • Text Generation • 0.3B • 25 • 1
ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16 • Text Generation • 15B • 5 • 1
ModelCloud/Brumby-14B-Base-GPTQMODEL-W4A16-v2 • Text Generation • 15B • 43 • 1
SherlockID365/Qwen3-VL-8B-Instruct-quantized.w4a16 • Image-Text-to-Text • 3B • 46 • 1
Ishant86/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-compressed-tensors-int4 • 6B