---
license: apache-2.0
base_model: Qwen/Qwen3-0.6B
pipeline_tag: text-generation
---

# NexaAI/Qwen3-0.6B

## Quickstart

Run the model directly with [nexa-sdk](https://github.com/NexaAI/nexa-sdk) installed.
In the nexa-sdk CLI:

```bash
nexa infer NexaAI/Qwen3-0.6B
```

#### Available Quantizations
| Filename | Quant type | File Size | Split | Description |
| -------- | ---------- | --------- | ----- | ----------- |
| [Qwen3-0.6B-Q8_0.gguf](https://huggingface.co/NexaAI/Qwen3-0.6B/blob/main/Qwen3-0.6B-Q8_0.gguf) | Q8_0 | 805 MB | false | High-quality 8-bit quantization. Recommended for efficient inference. |
| [Qwen3-0.6B-f16.gguf](https://huggingface.co/NexaAI/Qwen3-0.6B/blob/main/Qwen3-0.6B-f16.gguf) | f16 | 1.51 GB | false | Half-precision (FP16) format. Better accuracy, requires more memory. |

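These are standard GGUF files, so they should also load in other GGUF runtimes besides nexa-sdk. As one hedged example, here is a minimal sketch using llama-cpp-python with the Q8_0 file downloaded locally; the local path and generation settings are illustrative assumptions, not something this repo prescribes.

```python
# Minimal sketch (assumption): running the Q8_0 file with llama-cpp-python
# instead of nexa-sdk. Download Qwen3-0.6B-Q8_0.gguf from this repo first;
# the path below is illustrative.
from llama_cpp import Llama

llm = Llama(
    model_path="./Qwen3-0.6B-Q8_0.gguf",  # local path to the quantized file
    n_ctx=32768,                          # native context length of Qwen3-0.6B
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me a short introduction to large language models."}],
    max_tokens=256,
)
print(out["choices"][0]["message"]["content"])
```
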
## Overview

Qwen3 is the latest generation of large language models in the Qwen series, offering a comprehensive suite of dense and mixture-of-experts (MoE) models. Built upon extensive training, Qwen3 delivers groundbreaking advancements in reasoning, instruction following, agent capabilities, and multilingual support, with the following key features:

- **Unique support for seamless switching between thinking mode** (for complex logical reasoning, math, and coding) **and non-thinking mode** (for efficient, general-purpose dialogue) **within a single model**, ensuring optimal performance across various scenarios (see the sketch after this list).
- **Significant enhancement of its reasoning capabilities**, surpassing the previous QwQ (in thinking mode) and Qwen2.5 instruct models (in non-thinking mode) on mathematics, code generation, and commonsense logical reasoning.
- **Superior human preference alignment**, excelling in creative writing, role-playing, multi-turn dialogue, and instruction following, to deliver a more natural, engaging, and immersive conversational experience.
- **Expertise in agent capabilities**, enabling precise integration with external tools in both thinking and non-thinking modes and achieving leading performance among open-source models in complex agent-based tasks.
- **Support for 100+ languages and dialects** with strong capabilities for **multilingual instruction following** and **translation**.

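The thinking/non-thinking switch described above is exposed through the chat template of the upstream Qwen3 checkpoints. The snippet below is a sketch based on the upstream Qwen/Qwen3-0.6B usage, assuming a recent 🤗 Transformers; it loads the original weights rather than the GGUF files in this repo.

```python
# Sketch: toggling thinking vs. non-thinking mode via the chat template,
# following the upstream Qwen/Qwen3-0.6B usage (assumes recent transformers).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/Qwen3-0.6B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype="auto", device_map="auto")

messages = [{"role": "user", "content": "How many r's are in 'strawberry'?"}]
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
    enable_thinking=True,  # set False for plain, non-thinking dialogue
)

inputs = tokenizer([text], return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=512)
print(tokenizer.decode(output_ids[0][inputs.input_ids.shape[-1]:], skip_special_tokens=True))
```
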
#### Model Overview

**Qwen3-0.6B** has the following features (a quick programmatic check of these values is sketched after the list):
- Type: Causal Language Model
- Training Stage: Pretraining & Post-training
- Number of Parameters: 0.6B
- Number of Parameters (Non-Embedding): 0.44B
- Number of Layers: 28
- Number of Attention Heads (GQA): 16 for Q and 8 for KV
- Context Length: 32,768

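If you want to verify the layer and head counts above programmatically, one assumed way is to read the upstream config with 🤗 Transformers; the attribute names below follow the standard Hugging Face conventions for Qwen-style configs.

```python
# Sketch: reading the architecture numbers listed above from the upstream config.
# Attribute names are the standard Hugging Face ones for Qwen-style configs.
from transformers import AutoConfig

cfg = AutoConfig.from_pretrained("Qwen/Qwen3-0.6B")
print("layers:", cfg.num_hidden_layers)                 # expected: 28
print("attention heads (Q):", cfg.num_attention_heads)  # expected: 16
print("KV heads (GQA):", cfg.num_key_value_heads)       # expected: 8
print("max positions:", cfg.max_position_embeddings)    # upper bound; the card lists a 32,768-token context
```
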
For more details, including benchmark evaluation, hardware requirements, and inference performance, please refer to our [blog](https://qwenlm.github.io/blog/qwen3/), [GitHub](https://github.com/QwenLM/Qwen3), and [Documentation](https://qwen.readthedocs.io/en/latest/).


## Benchmark Results


## Reference
**Original model card**: [Qwen/Qwen3-0.6B](https://huggingface.co/Qwen/Qwen3-0.6B)