How to use RISys-Lab/RedSage-Qwen3-8B-DPO with Transformers:
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="RISys-Lab/RedSage-Qwen3-8B-DPO")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("RISys-Lab/RedSage-Qwen3-8B-DPO")
model = AutoModelForCausalLM.from_pretrained("RISys-Lab/RedSage-Qwen3-8B-DPO")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
How to use RISys-Lab/RedSage-Qwen3-8B-DPO with vLLM:
Install from pip and serve model
# Install vLLM from pip:
pip install vllm

# Start the vLLM server:
vllm serve "RISys-Lab/RedSage-Qwen3-8B-DPO"

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "RISys-Lab/RedSage-Qwen3-8B-DPO",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker
docker model run hf.co/RISys-Lab/RedSage-Qwen3-8B-DPO
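The curl call against the OpenAI-compatible endpoint can also be issued from Python. The following is a minimal sketch using only the standard library; the helper names `build_chat_request` and `chat` are hypothetical, and the default base URL assumes the vLLM server from the commands above is listening on port 8000.

```python
import json
import urllib.request

MODEL_ID = "RISys-Lab/RedSage-Qwen3-8B-DPO"


def build_chat_request(user_message, base_url="http://localhost:8000", model=MODEL_ID):
    """Build (without sending) a POST request for /v1/chat/completions.

    Hypothetical helper: it mirrors the JSON body of the curl example.
    """
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": user_message}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(user_message, base_url="http://localhost:8000", timeout=60):
    """Send the request to a running server and return the assistant's reply text."""
    request = build_chat_request(user_message, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        body = json.load(response)
    # OpenAI-compatible servers return the reply under choices[0].message.content
    return body["choices"][0]["message"]["content"]
```

With a server running, `print(chat("What is the capital of France?"))` would print the model's reply. Passing `base_url="http://localhost:30000"` should target an SGLang server started on port 30000 in the same way.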
How to use RISys-Lab/RedSage-Qwen3-8B-DPO with SGLang:
Install from pip and serve model
# Install SGLang from pip:
pip install sglang

# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "RISys-Lab/RedSage-Qwen3-8B-DPO" \
    --host 0.0.0.0 \
    --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "RISys-Lab/RedSage-Qwen3-8B-DPO",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
Use Docker images
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "RISys-Lab/RedSage-Qwen3-8B-DPO" \
        --host 0.0.0.0 \
        --port 30000

# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "RISys-Lab/RedSage-Qwen3-8B-DPO",
        "messages": [
            {
                "role": "user",
                "content": "What is the capital of France?"
            }
        ]
    }'
How to use RISys-Lab/RedSage-Qwen3-8B-DPO with Docker Model Runner:
docker model run hf.co/RISys-Lab/RedSage-Qwen3-8B-DPO
RedSage-Qwen3-8B-DPO
Model Summary
RedSage-Qwen3-8B-DPO is the final, aligned version of the RedSage cybersecurity LLM series developed by RISysLab. It represents the fourth and final stage of the RedSage training pipeline.
This model is fine-tuned from RedSage-Qwen3-8B-Ins using Direct Preference Optimization (DPO) on the AllenAI Tulu 3 Preference Mixture. This alignment stage significantly enhances the model's general reasoning capabilities and safety behaviors while maintaining the deep cybersecurity domain expertise acquired during previous stages.
- Developed by: RISysLab
- Repository: GitHub
- Base Model: RISys-Lab/RedSage-Qwen3-8B-Ins
- Paper: RedSage: A Cybersecurity Generalist LLM (arXiv)
Training Lineage
RedSage employs a multi-stage training pipeline. This model represents the output of Stage 4.
- Stage 1: Continual Pre-Training (CPT) -> RedSage-Qwen3-8B-CFW
- Stage 2: Targeted Pre-Training -> RedSage-Qwen3-8B-Base
- Stage 3: Supervised Fine-Tuning (SFT) -> RedSage-Qwen3-8B-Ins
- Stage 4: Direct Preference Optimization (DPO) -> RedSage-Qwen3-8B-DPO (Current Model)
  - Data: Tulu 3 Preference Mixture
Dataset: Preference Alignment
The model was aligned using the following high-quality preference dataset to ensure robust instruction following and general reasoning:
- Dataset: allenai/llama-3.1-tulu-3-8b-preference-mixture
- Description: A comprehensive collection of preference data used to align the Tulu 3 models, focusing on helpfulness, factuality, and safety.
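For readers unfamiliar with DPO, the standard objective scores each preference pair by how much more the policy favors the chosen response over the rejected one, relative to a frozen reference model. The sketch below shows that per-pair loss in plain Python. The function name and the `beta=0.1` default are illustrative assumptions, not RedSage's reported training settings, and the inputs are assumed to be summed token log-probabilities of each response.

```python
import math


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Per-pair DPO loss: -log(sigmoid(beta * (chosen_margin - rejected_margin))).

    beta=0.1 is an illustrative default, not a value taken from the model card.
    """
    # How much the policy has moved away from the reference on each response
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(x)) == log(1 + exp(-x)), computed in a numerically stable form
    return max(-logits, 0.0) + math.log1p(math.exp(-abs(logits)))


# Policy identical to the reference: loss equals log(2)
neutral = dpo_loss(-10.0, -20.0, -10.0, -20.0)

# Policy raised the chosen response and lowered the rejected one: loss drops below log(2)
improved = dpo_loss(-10.0, -20.0, -12.0, -15.0)
```

The loss falls as the policy widens its preference for the chosen response beyond what the reference model already expressed, which is the signal the preference pairs provide during this stage.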
Performance & Evaluation
RedSage-Qwen3-8B-DPO achieves the best balance between specialized domain knowledge and general capability among all RedSage variants.
1. RedSage-Bench (0-shot)
| Category | Qwen3-8B (non-reasoning) | RedSage-8B-DPO |
|---|---|---|
| Macro Average | 81.85 | 84.83 |
| Knowledge (General) | 80.46 | 82.48 |
| Knowledge (Frameworks) | 78.82 | 83.80 |
| Skill (Offensive) | 86.16 | 88.54 |
| Tools (CLI) | 83.92 | 86.30 |
| Tools (Kali) | 75.56 | 79.30 |
2. External Cybersecurity Benchmarks (0-shot)
| Benchmark | Qwen3-8B (non-reasoning) | RedSage-8B-DPO |
|---|---|---|
| Mean | 75.71 | 81.10 |
| CTI-Bench (MCQ) | 62.76 | 70.84 |
| CTI-Bench (RCM) | 54.00 | 70.60 |
| CyberMetric (500) | 88.60 | 90.00 |
| MMLU (Security) | 76.00 | 79.00 |
| SecBench (En) | 73.26 | 80.06 |
| SecEva (MCQ) | 65.46 | 74.22 |
| SECURE (CWET) | 88.11 | 91.35 |
| SECURE (KCV) | 87.42 | 82.86 |
| SECURE (MEAT) | 85.75 | 91.00 |
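The reported means and the per-benchmark gains can be recomputed directly from the rows of the table above. A short sketch, with the scores copied from the table and a hypothetical helper name `summarize`:

```python
# (Qwen3-8B non-reasoning, RedSage-8B-DPO), copied from the external benchmark table
scores = {
    "CTI-Bench (MCQ)": (62.76, 70.84),
    "CTI-Bench (RCM)": (54.00, 70.60),
    "CyberMetric (500)": (88.60, 90.00),
    "MMLU (Security)": (76.00, 79.00),
    "SecBench (En)": (73.26, 80.06),
    "SecEva (MCQ)": (65.46, 74.22),
    "SECURE (CWET)": (88.11, 91.35),
    "SECURE (KCV)": (87.42, 82.86),
    "SECURE (MEAT)": (85.75, 91.00),
}


def summarize(scores):
    """Return (baseline mean, model mean, per-benchmark deltas), rounded to 2 decimals."""
    n = len(scores)
    baseline_mean = sum(base for base, _ in scores.values()) / n
    model_mean = sum(model for _, model in scores.values()) / n
    deltas = {name: round(model - base, 2) for name, (base, model) in scores.items()}
    return round(baseline_mean, 2), round(model_mean, 2), deltas


baseline_mean, model_mean, deltas = summarize(scores)
# Benchmarks where the DPO model scores below the baseline
regressions = [name for name, delta in deltas.items() if delta < 0]
```

On these rows the means match the table (75.71 and 81.10), the largest gain is on CTI-Bench (RCM), and SECURE (KCV) is the only benchmark where the baseline scores higher.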
3. OpenLLM Leaderboard (General Benchmark)
| Benchmark | Qwen3-8B (non-reasoning) | RedSage-8B-DPO |
|---|---|---|
| Mean | 65.92 | 74.33 |
| MMLU | 73.59 | 77.07 |
| ARC-C | 62.54 | 71.76 |
| GSM8K | 75.66 | 82.71 |
| HellaSwag | 56.70 | 79.87 |
| TruthfulQA | 45.23 | 52.47 |
| WinoGrande | 62.51 | 73.01 |
| IFEval | 85.21 | 83.44 |
Usage
Use the standard chat template for inference.
from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "RISys-Lab/RedSage-Qwen3-8B-DPO"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Define the chat messages
messages = [
    {"role": "system", "content": "You are RedSage, a helpful cybersecurity assistant."},
    {"role": "user", "content": "Analyze the following log entry for potential indicators of compromise: 'POST /cgi-bin/test-cgi?* HTTP/1.1'"}
]

# Apply chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)

inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
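Note that decoding `outputs[0]` in full returns the prompt followed by the completion, because `generate` returns the input tokens as a prefix for decoder-only models. If only the model's reply is wanted, the prompt tokens can be sliced off first. A small sketch; the helper name `completion_only` is hypothetical:

```python
def completion_only(output_ids, prompt_length):
    """Return only the token ids generated after the prompt.

    Works on any sliceable sequence (a Python list or a 1-D tensor).
    """
    return output_ids[prompt_length:]


# With the variables from the usage example, this would look like:
#   prompt_length = inputs["input_ids"].shape[-1]
#   reply_ids = completion_only(outputs[0], prompt_length)
#   print(tokenizer.decode(reply_ids, skip_special_tokens=True))

# Toy illustration with plain integers standing in for token ids
toy_output = [101, 102, 103, 7, 8, 9]
toy_reply = completion_only(toy_output, prompt_length=3)
```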
Intended Use
- Primary Use: General-purpose cybersecurity assistance, log analysis, threat intelligence summarization, and educational queries.
- Benefits: Preference alignment gives closer adherence to instructions than the SFT-only version (RedSage-Qwen3-8B-Ins).
- Limitations: While aligned, the model may still produce incorrect information. Always verify outputs in critical security environments.
Citation
If you use this model or dataset, please cite our paper:
@inproceedings{suryanto2026redsage,
  title={RedSage: A Cybersecurity Generalist {LLM}},
  author={Naufal Suryanto and Muzammal Naseer and Pengfei Li and Syed Talal Wasim and Jinhui Yi and Juergen Gall and Paolo Ceravolo and Ernesto Damiani},
  booktitle={The Fourteenth International Conference on Learning Representations},
  year={2026},
  url={https://openreview.net/forum?id=W4FAenIrQ2}
}
Model tree for RISys-Lab/RedSage-Qwen3-8B-DPO
Base model
Qwen/Qwen3-8B-Base