qwen2.5-coder-3b-final-merged
This repository contains the final standalone merged model for the project.
It was created by merging:
- base model: Qwen/Qwen2.5-Coder-3B-Instruct
- final adapter: M-Alkassem/qwen2.5-coder-3b-agent-v1
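As a rough illustration of what "merging" means here, the sketch below folds a low-rank update into a base weight matrix. It assumes the adapter is LoRA-style (the coding adapter's repository name suggests this, but the card does not describe the merge procedure). The matrices, `ALPHA`, `RANK`, and the helper names are all made up for the example; this is not the script used to build this repository.

```python
# Toy illustration of folding a LoRA-style update into a base weight matrix.
# All numbers are hypothetical; only the identity
#     W_merged = W + (alpha / rank) * (B @ A)
# is the point of the example.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add_scaled(w, delta, scale):
    """Return w + scale * delta, element-wise."""
    return [[wi + scale * di for wi, di in zip(wr, dr)] for wr, dr in zip(w, delta)]

def merge_lora(w, a, b, alpha, rank):
    """Fold the low-rank update (alpha / rank) * B @ A into W."""
    return add_scaled(w, matmul(b, a), alpha / rank)

W = [[1.0, 2.0], [3.0, 4.0]]   # base weight, shape (2, 2)
A = [[0.5, -1.0]]              # low-rank "down" matrix, shape (rank, 2)
B = [[2.0], [0.0]]             # low-rank "up" matrix, shape (2, rank)
ALPHA, RANK = 2, 1

W_merged = merge_lora(W, A, B, ALPHA, RANK)

x = [[1.0], [1.0]]  # column input vector
# Output with the adapter applied as a separate branch ...
separate = add_scaled(matmul(W, x), matmul(B, matmul(A, x)), ALPHA / RANK)
# ... matches the output of the single merged matrix.
merged = matmul(W_merged, x)
```

Because the merged matrix reproduces the base-plus-adapter output, a merged checkpoint can be loaded as an ordinary model without the adapter files.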
What This Model Is
This is the final merged result of a two-stage, low-resource adaptation pipeline run on Google Colab with a T4 GPU.
Project stages:
- coding-focused fine-tuning
- agent-oriented continued fine-tuning
- final merge into one standalone model
The final agent adapter was trained by continuing from the coding adapter, so this merged model represents the latest learned state after both fine-tuning stages.
Training Background
Stage 1: Coding Fine-Tune
Dataset:
bigcode/self-oss-instruct-sc2-exec-filter-50k
Setup:
- sampled rows before filtering: 4000
- rows used after filtering: 3993
- max sequence length: 1024
- training steps: 250
Result:
- final training loss: about 0.6130
Stage 2: Agent-Oriented Continued Fine-Tune
Dataset:
ernie-research/MEnvData-SWE-Trajectory
Setup:
- sampled rows: 700
- max sequence length: 1024
- training steps: 150
Result:
- final training loss: about 1.2940
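For quick reference, the settings reported for both stages can be written down as plain data. The sketch below only restates the numbers given above; the `StageSummary` class and its field names are hypothetical, and `rows_used` is left as `None` for stage 2 because this card reports only the sampled row count there.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class StageSummary:
    """Hypothetical container for the per-stage numbers reported in this card."""
    name: str
    dataset: str
    rows_sampled: int
    rows_used: Optional[int]   # None when the card does not report it
    max_seq_length: int
    train_steps: int
    final_loss: float

    @property
    def retention(self) -> Optional[float]:
        """Fraction of sampled rows kept after filtering, if known."""
        if self.rows_used is None:
            return None
        return self.rows_used / self.rows_sampled

STAGES = [
    StageSummary("coding", "bigcode/self-oss-instruct-sc2-exec-filter-50k",
                 4000, 3993, 1024, 250, 0.6130),
    StageSummary("agent", "ernie-research/MEnvData-SWE-Trajectory",
                 700, None, 1024, 150, 1.2940),
]

# Total optimizer steps across both stages.
total_steps = sum(stage.train_steps for stage in STAGES)
```

The two final losses come from different datasets, so they are not directly comparable with each other.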
Evaluation Notes
In the direct-answer benchmark, the original base model remained the strongest plain answer-only model overall.
The main value of this final merged model is different:
- it is the final standalone artifact of the project
- it is more aligned to constrained tool-using workflows
- it performed best when used as the reasoning core of a lightweight coding agent
The benchmark summary image above shows the plain prompting comparison:
- Base model overall mean: 3.97
- Coding adapter overall mean: 2.97
- Agent adapter overall mean: 1.77
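The gaps between these means can be computed directly. The snippet below is only arithmetic on the three reported numbers; the dictionary keys are illustrative labels, and the scoring scale is not stated in this card.

```python
# Overall means from the plain-prompting comparison reported above.
scores = {"base": 3.97, "coding_adapter": 2.97, "agent_adapter": 1.77}

# Strongest variant under plain prompting.
best = max(scores, key=scores.get)

# How far each variant falls below the base model (rounded to 2 decimals).
gaps = {name: round(scores["base"] - score, 2) for name, score in scores.items()}
```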
The agent workflow image shows the documented agent_v2 result where the model:
- ran failing tests
- identified a bug
- rewrote code
- reran tests
- stopped after success
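The steps listed above follow a simple test-and-repair loop. The sketch below shows that control flow in a minimal form; it is not the project's agent_v2 code. The names `run_agent`, `run_tests`, and `stub_fix` are hypothetical, and `stub_fix` is a hard-coded stand-in for the call where the model would propose a rewrite.

```python
def run_agent(source, run_tests, propose_fix, max_rewrites=3):
    """Run tests, request a rewrite on failure, rerun, and stop on success.

    Returns (final_source, tests_passed, number_of_rewrites).
    """
    rewrites = 0
    while True:
        passed, report = run_tests(source)
        if passed or rewrites == max_rewrites:
            return source, passed, rewrites
        # In a real agent, the model would read the failure report here.
        source = propose_fix(source, report)
        rewrites += 1

# --- Toy environment for the sketch (all hypothetical) ---

BUGGY = "def add(a, b):\n    return a - b\n"

def run_tests(source):
    """Execute the candidate source and check one expected behaviour."""
    namespace = {}
    exec(source, namespace)
    result = namespace["add"](2, 3)
    if result == 5:
        return True, "ok"
    return False, f"add(2, 3) returned {result}, expected 5"

def stub_fix(source, report):
    """Stand-in for the model: applies a fixed textual patch."""
    return source.replace("a - b", "a + b")

fixed, passed, rewrites = run_agent(BUGGY, run_tests, stub_fix)
```

Capping the number of rewrites keeps the loop from running indefinitely when the proposed fixes never make the tests pass.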
How To Load
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "M-Alkassem/qwen2.5-coder-3b-final-merged"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
# Fall back to the EOS token if the tokenizer defines no pad token.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Load in half precision and let Accelerate place the weights on available devices.
model = AutoModelForCausalLM.from_pretrained(
    MODEL_ID,
    torch_dtype=torch.float16,
    device_map="auto",
)
model.eval()
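Once the model is loaded, a chat-style generation call might look like the sketch below. It assumes the merged model keeps the base model's chat template. The helper names `build_messages` and `generate_reply` and the default of 128 new tokens are illustrative choices, not settings from this project. The heavy imports sit inside the function so that defining the helpers does not require a GPU or a download.

```python
MODEL_ID = "M-Alkassem/qwen2.5-coder-3b-final-merged"

def build_messages(prompt: str) -> list:
    """Wrap a plain prompt in the chat-message format expected by the template."""
    return [{"role": "user", "content": prompt}]

def generate_reply(prompt: str, max_new_tokens: int = 128) -> str:
    """Load the model, generate a reply, and return only the new text."""
    # Imported lazily: these pull in large dependencies and trigger downloads.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, use_fast=True)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,
        device_map="auto",
    )
    model.eval()

    # Render the chat template and tokenize in one step.
    inputs = tokenizer.apply_chat_template(
        build_messages(prompt),
        add_generation_prompt=True,
        tokenize=True,
        return_dict=True,
        return_tensors="pt",
    ).to(model.device)

    with torch.no_grad():
        outputs = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Drop the prompt tokens so only the newly generated reply is decoded.
    prompt_len = inputs["input_ids"].shape[-1]
    return tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True)
```

Calling `generate_reply("Write a Python function that reverses a string.")` downloads the weights on first use. For repeated calls, load the tokenizer and model once outside the function.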
Related Repositories
- coding adapter: M-Alkassem/qwen2.5-coder-3b-unsloth-lora
- agent adapter: M-Alkassem/qwen2.5-coder-3b-agent-v1
References
- Base model: https://huggingface.co/Qwen/Qwen2.5-Coder-3B-Instruct
- Coding adapter: https://huggingface.co/M-Alkassem/qwen2.5-coder-3b-unsloth-lora
- Agent adapter: https://huggingface.co/M-Alkassem/qwen2.5-coder-3b-agent-v1
- Coding dataset: https://huggingface.co/datasets/bigcode/self-oss-instruct-sc2-exec-filter-50k
- Agent dataset: https://huggingface.co/datasets/ernie-research/MEnvData-SWE-Trajectory
Citation
If you use this model, please cite:
@article{hui2024qwen2p5coder,
  title={Qwen2.5-Coder Technical Report},
  author={Hui, Binyuan and Yang, Jian and Cui, Zeyu and Yang, Jing and Liu, Dayiheng and Zhang, Liqun and Liu, Tianyang and Zhang, Jiawei and Yu, Bo and Lu, Kaican and others},
  journal={arXiv preprint arXiv:2409.12186},
  year={2024}
}