Update README.md
README.md CHANGED
@@ -24,7 +24,6 @@ The Llama-2 7B Chat GGUF model is an instance of the Llama-2 architecture, devel
 ## Files and Versions
 
 - **llama-2-7b-chat.GGUF.q4_0.bin**: 4-bit quantized model (3.6 GB)
-- **llama-2-7b-chat.GGUF.q5_0.bin**: 5-bit quantized model (4.4 GB)
 - **llama-2-7b-chat.GGUF.q8_0.bin**: 8-bit quantized model (6.7 GB)
 
 The model has been converted and quantized using the GGUF format. Conversion was performed using Georgi Gerganov's llama.cpp library, and quantization was accomplished using the llama-cpp-python tool created by Andrei Betlen.
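The file sizes listed above can be sanity-checked from the quantization format alone. A minimal sketch, assuming the 6.74B parameter count of Llama-2 7B and llama.cpp's "type-0" block layout, where each 32-weight block stores n-bit weights plus one fp16 scale (an extra 16/32 = 0.5 bits per weight); both figures are assumptions, not stated in this README:

```python
def estimate_size_gib(n_params: float, bits_per_weight: float) -> float:
    """Rough size of a quantized model file, in GiB."""
    return n_params * bits_per_weight / 8 / 2**30

# Assumption: ~6.74e9 parameters for Llama-2 7B.
N_PARAMS = 6.74e9

# Effective bits/weight = quantization bits + 0.5 for the per-block fp16 scale
# (assumed llama.cpp "type-0" layout: 32-weight blocks, one scale each).
for name, bits in [("q4_0", 4.5), ("q5_0", 5.5), ("q8_0", 8.5)]:
    print(f"{name}: ~{estimate_size_gib(N_PARAMS, bits):.1f} GiB")
```

The estimates land close to the listed 3.6, 4.4, and 6.7 GB, which suggests the README sizes are in GiB and exclude only minor metadata overhead.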