Update README.md
README.md CHANGED
@@ -10,6 +10,10 @@ quantized_by: bartowski
 
 <b>Yes, this is with the fix to the tokenizer!</b>
 
+If you want to make sure it's using the thought and output tokens, be sure to enable rendering of special tokens (in llama.cpp this is the `--special` flag).
+
+It is able to use them without rendering them, much like chat tokens; this will just let you *see* them as they're getting used by the model.
+
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3658">b3658</a> for quantization.
 
 Original model: https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B
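
As a concrete sketch, enabling special-token rendering with the `llama-cli` binary from a llama.cpp build might look like the following (the GGUF filename and prompt are placeholders, not part of the original README; substitute whichever quant file you actually downloaded):

```bash
# Render special tokens in the output so the model's thought/output
# markers are visible as they stream by, instead of being consumed
# silently. Model path and prompt below are illustrative only.
./llama-cli \
  -m Reflection-Llama-3.1-70B-Q4_K_M.gguf \
  -p "How many r's are in the word strawberry?" \
  --special
```

Without `--special` the model still uses these tokens internally to structure its reasoning; the flag only changes whether they are printed.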