Update README.md
README.md
CHANGED
@@ -306,8 +306,21 @@ Third Eye Blind remains a beloved rock band with a dedicated fan base. Their mus
|
|
306 |
| [GGUF](https://huggingface.co/mradermacher/Llama-3.1-SISaAI-Ko-merge-8B-Instruct-GGUF/resolve/main/Llama-3.1-SISaAI-Ko-merge-8B-Instruct.Q8_0.gguf) | Q8_0 | 8.6 | fast, best quality |
|
307 |
| [GGUF](https://huggingface.co/mradermacher/Llama-3.1-SISaAI-Ko-merge-8B-Instruct-GGUF/resolve/main/Llama-3.1-SISaAI-Ko-merge-8B-Instruct.f16.gguf) | f16 | 16.2 | 16 bpw, overkill |
|
308 |
|
309 |
-
|
310 |
-
|
311 |
|
312 |
![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)
|
313 |
|
|
|
306 |
| [GGUF](https://huggingface.co/mradermacher/Llama-3.1-SISaAI-Ko-merge-8B-Instruct-GGUF/resolve/main/Llama-3.1-SISaAI-Ko-merge-8B-Instruct.Q8_0.gguf) | Q8_0 | 8.6 | fast, best quality |
|
307 |
| [GGUF](https://huggingface.co/mradermacher/Llama-3.1-SISaAI-Ko-merge-8B-Instruct-GGUF/resolve/main/Llama-3.1-SISaAI-Ko-merge-8B-Instruct.f16.gguf) | f16 | 16.2 | 16 bpw, overkill |
|
308 |
|
309 |
+
This graph compares the quality loss of various quantization methods against their size, focusing on lower-quality quant types:
|
310 |
+
|
311 |
+
X-axis (bpw): Bits per weight. Lower values mean higher compression.
|
312 |
+
|
313 |
+
Y-axis (PPL(Q)/PPL(fp16)-1): Relative increase in perplexity (PPL) of the quantized model Q over the fp16 baseline. Lower values mean less quality loss.
|
314 |
+
|
315 |
+
Methods Compared:
|
316 |
+
|
317 |
+
Pre-imatrix k-quants: k-quant types (such as Q4_K) produced without an importance matrix (imatrix).
|
318 |
+
|
319 |
+
Pre-imatrix legacy quants: The original quant types (such as Q4_0 and Q5_1), produced without an importance matrix (imatrix).
|
320 |
+
|
321 |
+
imatrix i- and k-quants: i-quants (IQ types) and k-quants produced with an importance matrix (imatrix).
|
322 |
+
|
323 |
+
Key Insight: imatrix-based methods (i- and k-quants) show less degradation at a given bpw, especially at lower bpw (higher compression), so they lose less quality at the same size than the pre-imatrix methods. The graph helps choose a quant type for the desired balance between compression and quality.
|
324 |
|
325 |
![image.png](https://www.nethype.de/huggingface_embed/quantpplgraph.png)
|
326 |
|
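The two axis quantities described in the added text can be computed directly. The following is a minimal Python sketch, not part of the README change. The function names are hypothetical, the perplexity values are made up for illustration, and the parameter count (about 8.1e9) is an assumption inferred from the table's f16 row (16.2 GB at 16 bpw, treating GB as 10^9 bytes and ignoring file metadata).

```python
def relative_ppl_increase(ppl_quant: float, ppl_fp16: float) -> float:
    """Y-axis value: PPL(Q)/PPL(fp16) - 1 (0.0 means no degradation)."""
    if ppl_fp16 <= 0:
        raise ValueError("baseline perplexity must be positive")
    return ppl_quant / ppl_fp16 - 1.0


def bits_per_weight(file_size_bytes: float, n_params: float) -> float:
    """X-axis value: average number of bits stored per model weight."""
    if n_params <= 0:
        raise ValueError("parameter count must be positive")
    return file_size_bytes * 8 / n_params


# Hypothetical perplexities, chosen only to illustrate the formula.
degradation = relative_ppl_increase(ppl_quant=6.30, ppl_fp16=6.00)
print(f"relative PPL increase: {degradation:.3f}")

# Assumed parameter count, inferred from the f16 row: 16.2e9 bytes * 8 / 16 bpw.
N_PARAMS = 16.2e9 * 8 / 16
# Q8_0 row of the table: 8.6 GB.
print(f"Q8_0 bpw estimate: {bits_per_weight(8.6e9, N_PARAMS):.2f}")
```

Under these assumptions the Q8_0 estimate comes out close to 8.5 bpw. Because the table sizes are rounded and include file metadata, such estimates are only approximate.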