====== Perplexity statistics ======
Mean PPL(Q)                   :  23.315818 ±   0.220552
Mean PPL(base)                :  23.352232 ±   0.220841
Cor(ln(PPL(Q)), ln(PPL(base))): 99.50%
Mean ln(PPL(Q)/PPL(base))     :  -0.001561 ±   0.000949
Mean PPL(Q)/PPL(base)         :   0.998441 ±   0.000948
Mean PPL(Q)-PPL(base)         :  -0.036415 ±   0.022157

====== KL divergence statistics ======
Mean    KLD:   0.037680 ±   0.000153
Maximum KLD:   5.026946
99.9%   KLD:   0.544697
99.0%   KLD:   0.251806
Median  KLD:   0.018222
10.0%   KLD:   0.000471
 5.0%   KLD:   0.000107
 1.0%   KLD:  -0.000012
Minimum KLD:  -0.000554

====== Token probability statistics ======
Mean    Δp:   0.151 ±  0.012 %
Maximum Δp:  80.430%
99.9%   Δp:  29.435%
99.0%   Δp:  16.228%
95.0%   Δp:   7.580%
90.0%   Δp:   4.031%
75.0%   Δp:   0.584%
Median  Δp:   0.000%
25.0%   Δp:  -0.350%
10.0%   Δp:  -3.338%
 5.0%   Δp:  -6.778%
 1.0%   Δp: -15.802%
 0.1%   Δp: -30.165%
Minimum Δp: -63.405%
RMS Δp    :   4.833 ±  0.025 %
Same top p:  91.410 ±  0.072 %

llama_perf_context_print:        load time =  103826.14 ms
llama_perf_context_print: prompt eval time = 1941845.24 ms / 304128 tokens (    6.38 ms per token,   156.62 tokens per second)
llama_perf_context_print:        eval time =       0.00 ms /     1 runs   (    0.00 ms per token,      inf tokens per second)
llama_perf_context_print:       total time = 2633698.11 ms / 304129 tokens
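For reference, the per-token quantities these statistics aggregate can be sketched as follows. This is a minimal illustration, not llama.cpp's implementation: it assumes KLD is computed as KL(base || Q) over each token's softmax distribution, Δp is the change (in percentage points) in the probability the quantized model assigns to the observed token, and "Same top p" counts positions where both models agree on the most likely token.

```python
import math

def kl_divergence(p_base, p_q):
    # KL(base || Q) = sum_i p_base[i] * ln(p_base[i] / p_q[i])
    # Skips zero-probability entries, which contribute nothing.
    return sum(pb * math.log(pb / pq)
               for pb, pq in zip(p_base, p_q) if pb > 0.0)

def token_stats(p_base, p_q, observed_idx):
    # Delta-p: change in probability of the observed token, in percent.
    delta_p = (p_q[observed_idx] - p_base[observed_idx]) * 100.0
    # "Same top": do both distributions pick the same argmax token?
    same_top = (max(range(len(p_base)), key=p_base.__getitem__)
                == max(range(len(p_q)), key=p_q.__getitem__))
    return delta_p, same_top

# Toy 3-token vocabulary: the quantized model shifts some mass
# from token 0 to token 1 but keeps the same top token.
p_base = [0.7, 0.2, 0.1]
p_q    = [0.6, 0.3, 0.1]
kld = kl_divergence(p_base, p_q)          # small positive value
dp, same = token_stats(p_base, p_q, 0)    # dp = -10.0, same = True
```

Aggregating `kld`, `dp`, and `same` over all evaluated tokens yields the mean/percentile rows above; negative KLD percentiles in the log reflect floating-point rounding in the real computation, since exact KL divergence is non-negative.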