Dolphin3.0-R1-Mistral-24B-GGUF / scores / Dolphin3.0-R1-Mistral-24B-Q8_0.log
eaddario · Generate perplexity and kld scores · 1f75c4c
====== Perplexity statistics ======
Mean PPL(Q) : 23.519878 ± 0.223215
Mean PPL(base) : 23.352232 ± 0.220841
Cor(ln(PPL(Q)), ln(PPL(base))): 99.96%
Mean ln(PPL(Q)/PPL(base)) : 0.007153 ± 0.000271
Mean PPL(Q)/PPL(base) : 1.007179 ± 0.000273
Mean PPL(Q)-PPL(base) : 0.167646 ± 0.006738
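As a quick sanity check, the reported ratio and difference follow directly from the two mean perplexities, and the mean log-ratio exponentiates to the reported ratio. A minimal verification using only the values printed above:

```python
import math

# Values copied from the log above
ppl_q = 23.519878        # Mean PPL(Q)      (Q8_0 quant)
ppl_base = 23.352232     # Mean PPL(base)   (unquantized reference)
mean_ln_ratio = 0.007153 # Mean ln(PPL(Q)/PPL(base))

ratio = ppl_q / ppl_base       # ≈ 1.007179, matches Mean PPL(Q)/PPL(base)
diff = ppl_q - ppl_base        # ≈ 0.167646, matches Mean PPL(Q)-PPL(base)
from_log_ratio = math.exp(mean_ln_ratio)  # ≈ 1.007179

print(round(ratio, 6), round(diff, 6), round(from_log_ratio, 6))
```

In other words, Q8_0 quantization costs roughly 0.7% perplexity on this model, well within the reported uncertainties.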
====== KL divergence statistics ======
Mean KLD: 0.002318 ± 0.000013
Maximum KLD: 0.848010
99.9% KLD: 0.043823
99.0% KLD: 0.015801
Median KLD: 0.001166
10.0% KLD: 0.000025
5.0% KLD: -0.000001
1.0% KLD: -0.000063
Minimum KLD: -0.000401
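The per-token KLD compares the base model's token distribution against the quantized model's at the same position. A minimal sketch of that computation, using hypothetical 4-token vocabulary distributions (not values from this run):

```python
import math

def token_kld(p_base, p_q):
    """KL divergence D(base || quant) for one token position,
    summed over the vocabulary."""
    return sum(pb * math.log(pb / pq)
               for pb, pq in zip(p_base, p_q) if pb > 0)

# Hypothetical softmax outputs over a toy 4-token vocabulary
p_base = [0.70, 0.20, 0.07, 0.03]
p_q    = [0.68, 0.21, 0.08, 0.03]

kld = token_kld(p_base, p_q)
print(kld)  # small positive value; this Q8_0 run averages ~0.0023 nats
```

True KL divergence is non-negative, so the slightly negative low-percentile values above (e.g. Minimum KLD: -0.000401) are floating-point artifacts from computing the statistic at limited precision, not genuine negative divergence.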
====== Token probability statistics ======
Mean Δp: 0.006 ± 0.003 %
Maximum Δp: 26.906%
99.9% Δp: 7.636%
99.0% Δp: 4.056%
95.0% Δp: 1.919%
90.0% Δp: 0.994%
75.0% Δp: 0.127%
Median Δp: -0.000%
25.0% Δp: -0.119%
10.0% Δp: -0.958%
5.0% Δp: -1.871%
1.0% Δp: -4.090%
0.1% Δp: -7.817%
Minimum Δp: -20.640%
RMS Δp : 1.266 ± 0.008 %
Same top p: 97.778 ± 0.038 %
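Δp here is the change (in percent) in the probability the quantized model assigns to the reference token relative to the base model, and "Same top p" is the fraction of positions where both models agree on the argmax token. A sketch of the mean and RMS Δp computation over toy per-position probabilities (hypothetical values, not from this run):

```python
# Hypothetical per-position probabilities of the reference token
base_p  = [0.91, 0.55, 0.33, 0.80]
quant_p = [0.90, 0.57, 0.31, 0.81]

# Δp per position, in percent (quantized minus base)
delta_p = [(q - b) * 100 for b, q in zip(base_p, quant_p)]

mean_dp = sum(delta_p) / len(delta_p)
rms_dp = (sum(d * d for d in delta_p) / len(delta_p)) ** 0.5

print(mean_dp, rms_dp)
```

A mean Δp near zero with a larger RMS Δp (as in this log: 0.006% vs 1.266%) indicates the quantized model's per-token shifts mostly cancel out rather than being systematically biased in one direction.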
llama_perf_context_print: load time = 3238.55 ms
llama_perf_context_print: prompt eval time = 2842839.16 ms / 304128 tokens ( 9.35 ms per token, 106.98 tokens per second)
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_perf_context_print: total time = 2975924.16 ms / 304129 tokens
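The per-token and throughput figures in the prompt-eval line are derived from the total time and token count; a quick arithmetic check of the values printed above:

```python
# Values copied from the prompt eval line above
prompt_ms = 2842839.16
tokens = 304128

ms_per_token = prompt_ms / tokens          # ≈ 9.35 ms per token
tok_per_s = tokens / (prompt_ms / 1000.0)  # ≈ 106.98 tokens per second

print(round(ms_per_token, 2), round(tok_per_s, 2))
```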