====== Perplexity statistics ======
Mean PPL(Q)                   :  25.118649 ±   0.245079
Mean PPL(base)                :  24.931431 ±   0.241228
Cor(ln(PPL(Q)), ln(PPL(base))):  99.91%
Mean ln(PPL(Q)/PPL(base))     :   0.007481 ±   0.000431
Mean PPL(Q)/PPL(base)         :   1.007509 ±   0.000434
Mean PPL(Q)-PPL(base)         :   0.187218 ±   0.011259

====== KL divergence statistics ======
Mean    KLD:   0.000361 ±   0.000002
Maximum KLD:   0.215854
99.9%   KLD:   0.004745
99.0%   KLD:   0.002203
Median  KLD:   0.000212
10.0%   KLD:   0.000002
 5.0%   KLD:  -0.000000
 1.0%   KLD:  -0.000014
Minimum KLD:  -0.000127

====== Token probability statistics ======
Mean    Δp:  0.003 ± 0.001 %
Maximum Δp: 24.559%
99.9%   Δp:  2.974%
99.0%   Δp:  1.523%
95.0%   Δp:  0.686%
90.0%   Δp:  0.364%
75.0%   Δp:  0.050%
Median  Δp:  0.000%
25.0%   Δp: -0.047%
10.0%   Δp: -0.356%
 5.0%   Δp: -0.670%
 1.0%   Δp: -1.484%
 0.1%   Δp: -2.907%
Minimum Δp: -7.835%
RMS Δp    :  0.468 ± 0.005 %
Same top p: 98.983 ± 0.026 %
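
The perplexity figures in this report can be cross-checked against each other with a little arithmetic. The sketch below is only a sanity check on the printed numbers, not the tool's own computation: the variable names are illustrative, and the values are copied by hand from the report. It assumes the reported ratio should agree with both PPL(Q)/PPL(base) and exp of the mean log ratio, up to the six printed decimals.

```python
import math

# Values copied by hand from the perplexity statistics in this report.
ppl_q = 25.118649           # Mean PPL(Q)
ppl_base = 24.931431        # Mean PPL(base)
mean_log_ratio = 0.007481   # Mean ln(PPL(Q)/PPL(base))

# Two routes to the PPL ratio: directly from the means, and via the log ratio.
ratio_from_means = ppl_q / ppl_base
ratio_from_log = math.exp(mean_log_ratio)

# Absolute perplexity increase of the quantized model over the base model.
difference = ppl_q - ppl_base

print(f"ratio from means    : {ratio_from_means:.6f}")
print(f"ratio from log mean : {ratio_from_log:.6f}")
print(f"difference          : {difference:.6f}")
```

Both routes should land on the reported ratio of 1.007509, and the difference on the reported 0.187218, to within rounding of the printed digits.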