This is a language model instruction-tuned from Qwen/Qwen3-0.6B-Base.
| Step | Eval Loss | Change vs. previous eval |
|---|---|---|
| 2000 | 1.100951 | 0.000000 |
| 4000 | 1.065262 | -0.035689 |
| 6000 | 1.044676 | -0.020586 |
| 8000 | 1.030341 | -0.014335 |
| 10000 | 1.020166 | -0.010175 |
| 12000 | 1.012897 | -0.007268 |
| 14000 | 1.006105 | -0.006792 |
| 16000 | 1.002148 | -0.003957 |
| 18000 | 0.996928 | -0.005221 |
| 20000 | 0.992385 | -0.004543 |
| 22000 | 0.989214 | -0.003170 |
| 24000 | 0.986078 | -0.003136 |
| 26000 | 0.982033 | -0.004045 |
| 28000 | 0.979861 | -0.002172 |
| 30000 | 0.976318 | -0.003543 |
| 32000 | 0.974872 | -0.001446 |
| 34000 | 0.972485 | -0.002387 |
| 36000 | 0.970818 | -0.001667 |
| 38000 | 0.968975 | -0.001842 |
| 40000 | 0.966464 | -0.002511 |
| 42000 | 0.963575 | -0.002889 |
| 44000 | 0.963056 | -0.000518 |
| 46000 | 0.960805 | -0.002251 |
| 48000 | 0.957807 | -0.002998 |
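The last column of the table is the difference between each eval loss and the one before it, with the first row set to zero. A minimal sketch of that arithmetic, using only the first four rows; the function name `loss_changes` is a hypothetical helper, not part of any training script:

```python
# Eval losses copied from the first four rows of the table.
eval_loss = {
    2000: 1.100951,
    4000: 1.065262,
    6000: 1.044676,
    8000: 1.030341,
}


def loss_changes(losses):
    """Return {step: loss minus previous loss}; the first step gets 0.0."""
    steps = sorted(losses)
    changes = {steps[0]: 0.0}
    for prev, cur in zip(steps, steps[1:]):
        changes[cur] = losses[cur] - losses[prev]
    return changes


changes = loss_changes(eval_loss)
for step in sorted(changes):
    print(f"{step:>6} {eval_loss[step]:.6f} {changes[step]:+.6f}")
```

A negative value means the eval loss went down since the previous evaluation.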
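This is the second beta version of the model. It supports three tasks:
- Natural language to tags (NL to Tag)
- Tags to natural language (Tag to NL)
- Tag completion (Tag to Tag)
Compared with the first version, this one was trained on a dataset of about 3 million samples and uses 64dim; everything else is nearly identical.
On the same test set, eval_loss improved from 1.02 (first version) to 0.957807.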
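Model Details
- Base model: Qwen/Qwen3-0.6B-Base
- Fine-tuning method: Instruction fine-tuning
- Training data: The model was trained on about 3 million samples. The dataset covers three instruction tasks, each with about 920,000 training samples after cleaning and filtering.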
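How to Use
Use the task-specific instruction token to steer the model toward the corresponding task. Inputs and outputs must be wrapped in the specified XML format.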
1. Natural language description → Danbooru tags (<NLTOTAG>)
- Instruction:
<NLTOTAG>
- Input:
<caption>This illustration features a young girl standing with a smile, looking directly at the viewer. She's wearing a white beret, and her long, light-colored hair is styled with pigtails, adorned with ribbons. She sports round glasses and a white collared shirt with long sleeves, layered with a light orange sweater vest. A pleated skirt that matches the ribbons, and thigh-highs complete her outfit. Her hands are clasped over her chest, possibly holding a book or other object. The background is plain white, emphasizing the character.</caption>
- Output:
<tags><special>1girl</special><artists></artists><characters></characters><copyrights>original</copyrights><general>skirt, thighhighs, pleated_skirt, smile, white_background, hands_on_own_chest, looking_at_viewer, long_hair, glasses, solo, simple_background, long_sleeves, sketch</general><rating>safe</rating></tags>
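The `<tags>` output above is a flat sequence of named fields, each holding a comma-separated tag list. A minimal sketch of splitting such a string into a dictionary follows. `parse_tags` is a hypothetical helper, and the field names are taken only from the example above. It uses regular expressions rather than an XML parser on the assumption that tag text may contain characters that are not valid XML.

```python
import re

# Field names as they appear in the example output; assumed to be the full set.
FIELDS = ("special", "artists", "characters", "copyrights", "general", "rating")


def parse_tags(output):
    """Split a <tags>...</tags> string into {field: [tag, ...]}."""
    block = re.search(r"<tags>(.*?)</tags>", output, flags=re.DOTALL)
    if block is None:
        raise ValueError("no <tags>...</tags> block found")
    parsed = {}
    for field in FIELDS:
        match = re.search(rf"<{field}>(.*?)</{field}>", block.group(1), flags=re.DOTALL)
        text = match.group(1) if match else ""
        # Empty fields such as <artists></artists> become empty lists.
        parsed[field] = [t.strip() for t in text.split(",") if t.strip()]
    return parsed


example = (
    "<tags><special>1girl</special><artists></artists><characters></characters>"
    "<copyrights>original</copyrights><general>skirt, thighhighs, smile</general>"
    "<rating>safe</rating></tags>"
)
parsed = parse_tags(example)
print(parsed["general"])
```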
2. Danbooru tags → natural language description (<TAGTONL>)
- Instruction:
<TAGTONL>
- Input:
<tags><special>2girls</special><artists></artists><characters></characters><copyrights></copyrights><general>bracelet, casual, purple_hair, oekaki, outdoors, white_flower, blouse, jewelry, pendant, drill_hair, dandelion, closed_eyes, smile, short_sleeves, frills</general><rating>safe</rating></tags>
- Output:
<caption>The image is a vibrant, anime-style illustration of two young women, likely teens, lying together on a bed of green grass dotted with dandelions. The composition features a close-up, overhead perspective. The first girl has brown drill hair styled in two high twintails, secured with pink bows featuring black polka dots, and she's dressed in a polka-dotted dress with frilly details and short sleeves, with a pink heart pendant necklace. She's smiling widely, eyes closed, showing a blush on her cheeks, and is looking at the other girl. The second girl, also wearing a similar style dress with a pink bow, and has pink hair, looks back at her with a smile, hugging her. Both girls have bracelets. The overall atmosphere is bright and cheerful, suggesting a moment of friendship and intimacy in a sunny outdoor setting.</caption>
3. Tag completion and refinement (<TAGTOTAG>)
- Instruction:
<TAGTOTAG>
- Input:
<tags><special>2girls</special><artists></artists><characters></characters><copyrights></copyrights><general>looking_at_another, hug, camisole, on_back, blush, oekaki, field, jewelry, dandelion_clock, on_grass, pendant, blouse, short_sleeves, dandelion, frills, bow, casual, smile, sleeveless, outdoors, brown_hair, pink_bow, hair_ribbon, polka_dot, shirt, short_hair, yellow_flower, lying, flower, closed_eyes, bracelet, drill_hair, sparkle, grass, on_side, purple_hair, ribbon, on_ground, white_flower</general><rating>safe</rating></tags>
- Output:
<tags><special>2girls</special><artists></artists><characters></characters><copyrights></copyrights><general>closed_eyes, hair_ribbon, oekaki, sleeveless, sparkle, hug, pink_bow, white_flower, short_hair, looking_at_another, dandelion_clock, ribbon, pendant, flower, lying, purple_hair, bracelet, smile, bow, brown_hair, frills, blush, jewelry, short_sleeves, on_grass, casual, grass, outdoors, shirt, blouse, field, yellow_flower, camisole, on_back, twintails, polka_dot, on_ground, on_side, dandelion</general><rating>safe</rating></tags>
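The three tasks above share one pattern: an instruction token plus an input wrapped in `<caption>` or `<tags>`. A minimal sketch of assembling such a prompt string is shown here. `build_prompt` and the `TASKS` keys are hypothetical names, and the newline between the token and the wrapped input is an assumption, since the exact layout is not stated in this card.

```python
# Instruction tokens and wrapper elements as they appear in the examples above.
TASKS = {
    "nl_to_tag": ("<NLTOTAG>", "caption"),
    "tag_to_nl": ("<TAGTONL>", "tags"),
    "tag_to_tag": ("<TAGTOTAG>", "tags"),
}


def build_prompt(task, body, separator="\n"):
    """Wrap `body` in the task's XML element and prepend the instruction token.

    ASSUMPTION: token and wrapped input are joined by `separator`;
    the card does not specify the exact layout.
    """
    token, element = TASKS[task]
    open_tag, close_tag = f"<{element}>", f"</{element}>"
    body = body.strip()
    # Only add the wrapper when the caller has not already supplied it.
    if not (body.startswith(open_tag) and body.endswith(close_tag)):
        body = f"{open_tag}{body}{close_tag}"
    return f"{token}{separator}{body}"


prompt = build_prompt("nl_to_tag", "A girl in a white beret smiling at the viewer.")
print(prompt)
```

Whether this string should be passed through the tokenizer's chat template or fed as raw text is not stated here, so it is worth checking against the training setup before relying on this layout.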
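Known Issues
- <TAGTOTAG> dataset construction. Tags may have been randomly dropped from both the input (the tags to be completed) and the output (the completed tags). In principle, only the input should have tags dropped; this makes the model behave oddly on this task.
- Weak filtering of short samples. Very short tag samples were not filtered out.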
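The two issues above suggest what a corrected pair builder might look like: drop tags from the input only, keep the full list as the output, and skip samples that are too short. The following is a sketch under those assumptions, not the pipeline used for this model. `make_tagtotag_pair`, `keep_ratio`, and `min_tags` are hypothetical names and values chosen for illustration.

```python
import random


def make_tagtotag_pair(tags, keep_ratio=0.5, min_tags=5, rng=random):
    """Build one (input_tags, output_tags) pair, or None if the sample is too short.

    HYPOTHETICAL helper: `keep_ratio` and `min_tags` are illustrative values,
    not settings from the original training pipeline.
    """
    if len(tags) < min_tags:
        return None  # filter out very short tag samples
    n_keep = max(1, int(len(tags) * keep_ratio))
    kept = rng.sample(tags, n_keep)  # random drop applied to the input only
    return kept, list(tags)  # output keeps the full tag list


full = ["2girls", "hug", "smile", "outdoors", "grass", "dandelion", "closed_eyes", "bracelet"]
pair = make_tagtotag_pair(full, rng=random.Random(0))
print(pair)
```

Passing a seeded `random.Random` instance keeps the drop reproducible, which makes it easier to check that outputs are never truncated.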
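Future Plans
- Reprocess the dataset: switch to training on 10% dan and full dan.
- Tune model parameters: raise dim, or switch to a 1.5B Qwen model.
Citation
No citation. This is a hobby project, trained for fun.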