====== Perplexity statistics ======
Mean PPL(Q)                   : 26.172864 ± 0.256006
Mean PPL(base)                : 23.352232 ± 0.220841
Cor(ln(PPL(Q)), ln(PPL(base))): 99.01%
Mean ln(PPL(Q)/PPL(base))     :  0.114031 ± 0.001391
Mean PPL(Q)/PPL(base)         :  1.120786 ± 0.001559
Mean PPL(Q)-PPL(base)         :  2.820631 ± 0.048526

====== KL divergence statistics ======
Mean    KLD:  0.078326 ± 0.000305
Maximum KLD:  7.825869
99.9%   KLD:  1.127171
99.0%   KLD:  0.520127
Median  KLD:  0.038346
10.0%   KLD:  0.000902
 5.0%   KLD:  0.000207
 1.0%   KLD:  0.000001
Minimum KLD: -0.000505

====== Token probability statistics ======
Mean    Δp:  -0.107 ± 0.018 %
Maximum Δp:  86.241%
99.9%   Δp:  39.725%
99.0%   Δp:  22.115%
95.0%   Δp:   9.997%
90.0%   Δp:   5.175%
75.0%   Δp:   0.626%
Median  Δp:  -0.000%
25.0%   Δp:  -0.677%
10.0%   Δp:  -5.458%
 5.0%   Δp: -10.553%
 1.0%   Δp: -23.853%
 0.1%   Δp: -43.880%
Minimum Δp: -97.767%
RMS Δp    :   6.900 ± 0.035 %
Same top p:  88.021 ± 0.083 %

llama_perf_context_print:        load time =   82098.99 ms
llama_perf_context_print: prompt eval time = 1718829.57 ms / 304128 tokens (5.65 ms per token, 176.94 tokens per second)
llama_perf_context_print:        eval time =       0.00 ms / 1 runs (0.00 ms per token, inf tokens per second)
llama_perf_context_print:       total time = 1775127.52 ms / 304129 tokens
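For reference, the statistics above compare a quantized model Q against its base model token by token. A minimal sketch (not the actual llama.cpp implementation) of how a single token position contributes to these numbers, assuming `p_base` and `p_q` are the two models' softmax distributions over the same vocabulary; the function names here are hypothetical:

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over the vocabulary.
    z = logits - logits.max()
    e = np.exp(z)
    return e / e.sum()

def token_kld(p_base, p_q):
    # Per-token KL divergence KL(base || Q), summed over the vocabulary.
    # This is the quantity averaged into "Mean KLD" above; tiny negative
    # values like the "Minimum KLD" in the log are floating-point noise.
    return float(np.sum(p_base * (np.log(p_base) - np.log(p_q))))

def delta_p(p_base, p_q, tok):
    # Δp: change (in percent) of the probability the model assigns to the
    # correct next token `tok` when going from base to quantized.
    return 100.0 * float(p_q[tok] - p_base[tok])

# Simulated example: base logits plus small noise standing in for
# quantization error (purely illustrative, not real model output).
rng = np.random.default_rng(0)
logits_base = rng.normal(size=32000)
logits_q = logits_base + rng.normal(scale=0.05, size=32000)

p_base, p_q = softmax(logits_base), softmax(logits_q)
kld = token_kld(p_base, p_q)
dp = delta_p(p_base, p_q, tok=int(np.argmax(p_base)))
```

Aggregating `kld` and `dp` over every evaluated token position yields the mean, percentile, and RMS figures in the log; "Same top p" is the fraction of positions where both models agree on the argmax token.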