Qwen-Coder-14b-Vulrepair-GGUF

🧠 Model Overview

Qwen-Coder-14b-Vulrepair-GGUF is a quantized version of Qwen-Coder-14b-Vulrepair, optimized for efficient inference with reduced memory usage and faster runtime while preserving as much of the original model quality as possible.

This repository provides multiple quantized variants suitable for:

  • Local inference
  • Low-VRAM GPUs
  • CPU-only environments

🔗 Original Model


📦 Quantization Details

  • Quantization method: GGUF
  • Quantization tool: llama.cpp
  • Precision: Mixed (2 to 8 bits, depending on variant)
  • Activation-aware: No (weight-only quantization)
  • Group size: 256 (K-quant variants)
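To illustrate what weight-only, group-wise quantization means in practice, here is a minimal Python sketch: each group of weights shares one scale, and each weight is stored as a small signed integer. This is a simplified illustration only, not llama.cpp's actual K-quant layout (which uses super-blocks, sub-block scales, and packed storage).

```python
def quantize_group(weights, bits=4):
    """Quantize one group of float weights to signed ints with a shared scale."""
    qmax = 2 ** (bits - 1) - 1               # e.g. 7 for 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    q = [round(w / scale) for w in weights]  # integers in [-qmax, qmax]
    return q, scale

def dequantize_group(q, scale):
    """Reconstruct approximate float weights from ints and the shared scale."""
    return [v * scale for v in q]

# A toy "group" of weights (real groups hold 256 values per the spec above).
weights = [0.12, -0.45, 0.33, 0.07, -0.21, 0.5, -0.02, 0.18]
q, scale = quantize_group(weights)
restored = dequantize_group(q, scale)
max_err = max(abs(a - b) for a, b in zip(weights, restored))
print(f"quantized: {q}")
print(f"max reconstruction error: {max_err:.4f}")
```

The reconstruction error per weight is bounded by half the group's scale, which is why larger bit widths (Q5, Q6, Q8) track the original weights more closely than Q2/Q3.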

📦 Available Quantized Files

| Quant Format | File Name | Approx. Size | VRAM / RAM Needed | Notes |
|--------------|-----------|--------------|-------------------|-------|
| Q2_K | qwen-coder-14b-vulrepair-q2_k.gguf | ~5.7 GB | ~7 GB | Extreme compression; noticeable quality loss |
| Q3_K_S | qwen-coder-14b-vulrepair-q3_k_s.gguf | ~6.6 GB | ~7.5 GB | Smaller, faster, lower quality |
| Q3_K_M | qwen-coder-14b-vulrepair-q3_k_m.gguf | ~7.3 GB | ~8.5 GB | Better balance than Q3_K_S |
| Q3_K_L | qwen-coder-14b-vulrepair-q3_k_l.gguf | ~7.9 GB | ~9.8 GB | Highest-quality 3-bit variant |
| Q4_0 | qwen-coder-14b-vulrepair-q4_0.gguf | ~8.5 GB | ~10.3 GB | Legacy format; simpler quantization |
| Q4_K_S | qwen-coder-14b-vulrepair-q4_k_s.gguf | ~8.5 GB | ~10.5 GB | Smaller grouped 4-bit |
| Q4_K_M | qwen-coder-14b-vulrepair-q4_k_m.gguf | ~8.9 GB | ~11 GB | Recommended default |
| Q5_0 | qwen-coder-14b-vulrepair-q5_0.gguf | ~10.3 GB | ~12 GB | Higher quality, larger size |
| Q5_K_S | qwen-coder-14b-vulrepair-q5_k_s.gguf | ~10.3 GB | ~12.2 GB | Efficient high-quality variant |
| Q5_K_M | qwen-coder-14b-vulrepair-q5_k_m.gguf | ~10.5 GB | ~12.8 GB | Near-FP16 quality |
| Q6_K | qwen-coder-14b-vulrepair-q6_k.gguf | ~12.1 GB | ~14.5 GB | Minimal quantization loss |
| Q8_0 | qwen-coder-14b-vulrepair-q8_0.gguf | ~15.7 GB | ~16 GB | Maximum quality; large memory |

💡 Recommendation: Start with Q4_K_M for the best quality-to-performance ratio.
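A quick way to sanity-check the table is to compute the effective bits per weight of each file from its on-disk size and the parameter count. The ~14.8B parameter count below is an assumption based on the 14B model family; the exact figure may differ slightly. Note that the VRAM/RAM column is larger than the file size because the KV cache and runtime overhead come on top of the weights.

```python
def bits_per_weight(file_size_gb, n_params=14.8e9):
    """Effective bits per weight, given GGUF file size and parameter count.

    n_params=14.8e9 is an assumed figure for this 14B model family.
    """
    return file_size_gb * 1e9 * 8 / n_params

# Sizes taken from the table above.
for name, size_gb in [("Q2_K", 5.7), ("Q4_K_M", 8.9), ("Q8_0", 15.7)]:
    print(f"{name}: ~{bits_per_weight(size_gb):.1f} bits/weight")
```

For example, the recommended Q4_K_M works out to roughly 4.8 bits per weight, which is why its quality sits well above the 2- and 3-bit variants at a still-moderate memory cost.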


🚀 Usage Example

llama.cpp

./main -m qwen-coder-14b-vulrepair-q4_k_m.gguf -p "Write a Python function that checks whether a given string is a palindrome" -n 256

(In newer llama.cpp builds the binary is named llama-cli instead of main.)

Python (llama-cpp-python)

from llama_cpp import Llama

llm = Llama(
    model_path="qwen-coder-14b-vulrepair-q4_k_m.gguf",
    n_ctx=4096,
    n_threads=8
)

# The call returns a completion dict; print just the generated text.
output = llm("Your prompt here", max_tokens=256)
print(output["choices"][0]["text"])

🙋 Contact

Maintainer: M Mashhudur Rahim [XythicK]

Role: Independent Machine Learning Researcher & Model Infrastructure Maintainer (focused on model quantization, optimization, and efficient deployment)

For issues, improvement requests, or additional quantization formats, please use the Hugging Face Discussions or Issues tab.

❤️ Acknowledgements

Thanks to the original model authors for their ongoing contributions to open AI research, and to Hugging Face and the open-source machine learning community for providing the tools and platforms that make efficient model sharing and deployment possible.
