The process to produce the quantized [GGUF](https://huggingface.co/docs/hub/en/gguf) models is roughly as follows (sketched in the shell example below):

1. Convert the original model's safetensors into GGUF F16*
2. Estimate the Perplexity score for the F16 model (base) using [wikitext-2-raw-v1](https://huggingface.co/datasets/Salesforce/wikitext/tree/main/wikitext-2-raw-v1), and record the [logits](https://huggingface.co/eaddario/Watt-Tool-8B-GGUF/tree/main/logits)
3. Generate the [imatrix](https://huggingface.co/eaddario/Watt-Tool-8B-GGUF/tree/main/imatrix) for each calibration dataset
4. Create quantized versions of the base model using each imatrix per quant type
5. Calculate the Perplexity and KL Divergence scores for each quantized model [(logs)](https://huggingface.co/eaddario/Watt-Tool-8B-GGUF/tree/main/scores)
6. For each quant type, keep the version with the best (usually the lowest) scores

*[BF16](https://en.wikipedia.org/wiki/Bfloat16_floating-point_format) would be preferred, but Apple's GPUs don't support it yet, and therefore any operations are executed on the CPU, making it unacceptably slow. This is expected to change in the near term, but until then, if you are using Apple kit, avoid models tagged BF16.
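For reference, here is a minimal sketch of the steps above using llama.cpp's stock tools (`convert_hf_to_gguf.py`, `llama-imatrix`, `llama-quantize`, `llama-perplexity`). It assumes llama.cpp has been built and its binaries are on the PATH; the model directory, file names, calibration file, and the single Q4_K_M quant are illustrative placeholders, since the real process iterates over multiple calibration datasets and quant types.

```bash
# 1. Convert the original safetensors to GGUF F16 (script ships with llama.cpp)
python convert_hf_to_gguf.py ./Watt-Tool-8B --outtype f16 --outfile Watt-Tool-8B-F16.gguf

# 2. Score the F16 base on wikitext-2-raw-v1 and save its logits
#    for the later KL Divergence comparisons
llama-perplexity -m Watt-Tool-8B-F16.gguf -f wiki.test.raw \
  --kl-divergence-base Watt-Tool-8B-F16.logits

# 3. Generate an imatrix (repeat once per calibration dataset)
llama-imatrix -m Watt-Tool-8B-F16.gguf -f calibration.txt -o Watt-Tool-8B.imatrix

# 4. Quantize the base model with that imatrix (repeat per quant type)
llama-quantize --imatrix Watt-Tool-8B.imatrix \
  Watt-Tool-8B-F16.gguf Watt-Tool-8B-Q4_K_M.gguf Q4_K_M

# 5. Compute Perplexity and KL Divergence for the quantized model
#    against the saved base logits
llama-perplexity -m Watt-Tool-8B-Q4_K_M.gguf \
  --kl-divergence-base Watt-Tool-8B-F16.logits --kl-divergence

# 6. Compare the scores across runs and keep the best version per quant type
```

Saving the base logits once (step 2) is what makes step 5 cheap to repeat: every quantized variant is scored against the same reference distribution, so the KL Divergence numbers are directly comparable across imatrices and quant types.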