Update README.md
README.md
CHANGED
|
@@ -1,3 +1,10 @@
| 1 |
### Phi-3 Large Model and AWQ Quantization Principle
|
| 2 |
|
| 3 |
#### 1. Introduction to the Phi-3 Large Model
|
|
@@ -89,4 +96,4 @@ generation_output = model.generate(
|
|
| 89 |
- `max_new_tokens` replaces the older `max_seq_len`: it caps only the number of newly generated tokens, so the limit does not depend on the input length.
|
| 90 |
- `temperature` and `top_p` control output diversity and suit open-domain generation tasks such as creative writing; if deterministic output is required (for example, question answering), set `temperature=0.0`.
|
| 91 |
|
| 92 |
-
With the code above, the Phi-3 model runs efficiently in resource-constrained environments, and AWQ quantization enables low-cost, high-speed text generation.
|
|
|
|
| 1 |
+
---
|
| 2 |
+
license: apache-2.0
|
| 3 |
+
language:
|
| 4 |
+
- en
|
| 5 |
+
base_model:
|
| 6 |
+
- microsoft/Phi-4-mini-instruct
|
| 7 |
+
---
|
| 8 |
### Phi-3 Large Model and AWQ Quantization Principle
|
| 9 |
|
| 10 |
#### 1. Introduction to the Phi-3 Large Model
|
|
|
|
| 96 |
- `max_new_tokens` replaces the older `max_seq_len`: it caps only the number of newly generated tokens, so the limit does not depend on the input length.
|
| 97 |
- `temperature` and `top_p` control output diversity and suit open-domain generation tasks such as creative writing; if deterministic output is required (for example, question answering), set `temperature=0.0`.
|
| 98 |
|
| 99 |
+
With the code above, the Phi-3 model runs efficiently in resource-constrained environments, and AWQ quantization enables low-cost, high-speed text generation.
|
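Reviewer's sketch: the README bullets on `temperature` and `top_p` describe how sampling diversity is tuned. The pure-Python example below illustrates the general idea on a toy logit vector. It is not the library's implementation; the function names and `toy_logits` are hypothetical and chosen only for illustration. In this sketch a temperature of exactly 0 would divide by zero, so deterministic decoding is modelled as a plain argmax instead.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after dividing them by `temperature`."""
    if temperature <= 0:
        # Assumption of this sketch: treat non-positive values as invalid
        # and use greedy() below when deterministic output is wanted.
        raise ValueError("temperature must be > 0")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_p_filter(probs, top_p):
    """Keep the smallest set of most likely tokens whose cumulative
    probability reaches `top_p`, then renormalize over that set."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    mass = sum(probs[i] for i in kept)
    return {i: probs[i] / mass for i in kept}


def greedy(logits):
    """Deterministic choice: index of the largest logit."""
    return max(range(len(logits)), key=lambda i: logits[i])


toy_logits = [2.0, 1.0, 0.5, -1.0]  # hypothetical scores for four tokens

sharp = softmax_with_temperature(toy_logits, 0.5)  # low temperature: peaked
flat = softmax_with_temperature(toy_logits, 1.5)   # high temperature: flatter
nucleus = top_p_filter(softmax_with_temperature(toy_logits, 1.0), 0.9)
```

Lower temperatures concentrate probability on the top token, higher ones spread it out, and `top_p` then discards the unlikely tail before sampling. `max_new_tokens` is independent of both: it only bounds how many sampling steps are taken.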