wangyiqun/Phi-3-mini-4k-instruct-awq

4-bit precision

wangyiqun committed on Apr 15, 2025

Commit c05b3cf · verified · 1 Parent(s): e329503

Update README.md

Files changed (1)

README.md +8 -1

@@ -1,3 +1,10 @@
+---
+license: apache-2.0
+language:
+- en
+base_model:
+- microsoft/Phi-4-mini-instruct
+---
 ### Phi-3 Large Model and AWQ Quantization Principle
 #### 1. Introduction to the Phi-3 Large Model 🤖
@@ -89,4 +96,4 @@ generation_output = model.generate(
    - `max_new_tokens` replaces the traditional `max_seq_len`, clearly controlling the number of newly generated tokens and avoiding limitations affected by the input length 📏.
    - `temperature` and `top_p` adjust the output diversity, suitable for open-domain generation tasks (such as creative writing); if deterministic output is required (such as question answering), it can be set to `temperature=0.0` 🎨🔢.
-Through the above code, the Phi-3 model can be efficiently run in an environment with limited resources, and the AWQ quantization technology can be used to achieve low-cost and high-speed text generation 🚀.
+Through the above code, the Phi-3 model can be efficiently run in an environment with limited resources, and the AWQ quantization technology can be used to achieve low-cost and high-speed text generation 🚀.
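
Note on the sampling bullets in the README hunk above: in Hugging Face `transformers`, deterministic output is normally requested with `do_sample=False` (greedy decoding), and a sampling temperature of exactly `0.0` is typically rejected because the temperature must be strictly positive. The snippet below is a minimal sketch of how the two modes could be kept apart before calling `model.generate(**kwargs)`. The helper name `build_generation_kwargs` and the default values (256, 0.7, 0.9) are illustrative assumptions, not part of the README or of any library.

```python
from typing import Any, Dict


def build_generation_kwargs(
    deterministic: bool,
    max_new_tokens: int = 256,   # illustrative default, not from the README
    temperature: float = 0.7,    # illustrative default, not from the README
    top_p: float = 0.9,          # illustrative default, not from the README
) -> Dict[str, Any]:
    """Return keyword arguments intended for `model.generate(**kwargs)`.

    Hypothetical helper: it only assembles a dict and never touches a model.
    """
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be a positive integer")

    # max_new_tokens bounds only the newly generated tokens,
    # independent of how long the prompt is.
    kwargs: Dict[str, Any] = {"max_new_tokens": max_new_tokens}

    if deterministic:
        # Greedy decoding: sampling is off, so temperature and top_p are omitted.
        kwargs["do_sample"] = False
        return kwargs

    if temperature <= 0.0:
        raise ValueError("temperature must be > 0 when sampling")
    if not 0.0 < top_p <= 1.0:
        raise ValueError("top_p must be in (0, 1]")

    # Sampling mode: temperature and top_p shape the output diversity.
    kwargs.update(do_sample=True, temperature=temperature, top_p=top_p)
    return kwargs


if __name__ == "__main__":
    # Question answering: deterministic, short output.
    print(build_generation_kwargs(deterministic=True, max_new_tokens=64))
    # Creative writing: sampled output.
    print(build_generation_kwargs(deterministic=False))
```

With a loaded model and tokenized inputs, the resulting dict would be passed as `model.generate(**inputs, **kwargs)`; that call is left out here so the sketch runs without downloading any weights.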