bartowski committed on
Commit aab43ae · verified · 1 Parent(s): 9f00b00

Update README.md

Files changed (1): README.md +4 -0
README.md CHANGED
@@ -10,6 +10,10 @@ quantized_by: bartowski
 
 <b>Yes, this is with the fix to the tokenizer!</b>
 
+If you want to make sure it's using the thought and output tokens, be sure to enable rendering of special tokens (in llama.cpp this is the `--special` flag).
+
+It is able to use them without rendering them, much like chat tokens; this will just let you *see* them as they're getting used by the model.
+
 Using <a href="https://github.com/ggerganov/llama.cpp/">llama.cpp</a> release <a href="https://github.com/ggerganov/llama.cpp/releases/tag/b3658">b3658</a> for quantization.
 
 Original model: https://huggingface.co/mattshumer/Reflection-Llama-3.1-70B
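As a sketch of the `--special` flag the diff above refers to, a llama.cpp invocation might look like the following. The model filename, prompt, and quant choice are placeholders for illustration, not part of the commit:

```shell
# Hypothetical invocation sketch; the GGUF filename and prompt are placeholders.
# Without --special, llama.cpp consumes special tokens (e.g. the model's
# thought/output markers) silently; with it, they are rendered in the output
# so you can see them being used.
./llama-cli \
  -m Reflection-Llama-3.1-70B-Q4_K_M.gguf \
  --special \
  -p "What is the capital of France?"
```

The model behaves the same either way; the flag only changes whether the special tokens are printed alongside the generated text.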