====== Perplexity statistics ======
Mean PPL(Q)                   :   6.557455 ±   0.040211
Mean PPL(base)                :   6.554978 ±   0.040159
Cor(ln(PPL(Q)), ln(PPL(base))):  99.96%
Mean ln(PPL(Q)/PPL(base))     :   0.000378 ±   0.000173
Mean PPL(Q)/PPL(base)         :   1.000378 ±   0.000173
Mean PPL(Q)-PPL(base)         :   0.002477 ±   0.001132

====== KL divergence statistics ======
Mean    KLD:   0.001610 ±   0.000012
Maximum KLD:   1.216474
99.9%   KLD:   0.035302
99.0%   KLD:   0.010267
Median  KLD:   0.001088
10.0%   KLD:   0.000043
5.0%    KLD:   0.000011
1.0%    KLD:   0.000001
Minimum KLD:  -0.000107

====== Token probability statistics ======
Mean    Δp: -0.013 ± 0.003 %
Maximum Δp: 55.677%
99.9%   Δp:  7.060%
99.0%   Δp:  3.700%
95.0%   Δp:  1.915%
90.0%   Δp:  1.170%
75.0%   Δp:  0.266%
Median  Δp: -0.000%
25.0%   Δp: -0.285%
10.0%   Δp: -1.201%
5.0%    Δp: -1.965%
1.0%    Δp: -3.786%
0.1%    Δp: -7.655%
Minimum Δp: -31.459%
RMS Δp    :  1.279 ± 0.012 %
Same top p: 97.555 ± 0.041 %
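
As a sanity check on the perplexity figures above, the three comparison rows can be cross-checked against the two mean perplexities. The sketch below is a reader-side check, not part of the tool that produced this output; it assumes (this is not stated in the log) that the reported ratio is the exponential of the mean log-ratio and that the reported difference is the plain difference of the two means. All numbers are copied from the log; the variable names are illustrative.

```python
import math

# Values copied verbatim from the "Perplexity statistics" block.
ppl_q = 6.557455            # Mean PPL(Q)
ppl_base = 6.554978         # Mean PPL(base)
mean_log_ratio = 0.000378   # Mean ln(PPL(Q)/PPL(base))
reported_ratio = 1.000378   # Mean PPL(Q)/PPL(base)
reported_diff = 0.002477    # Mean PPL(Q)-PPL(base)

# Assumption: the difference row is simply PPL(Q) - PPL(base).
diff = ppl_q - ppl_base

# Assumption: the ratio row is exp(mean log-ratio).
ratio_from_log = math.exp(mean_log_ratio)

# Independent check: ratio of the two reported means.
ratio_direct = ppl_q / ppl_base

print(f"difference        : {diff:.6f} (reported {reported_diff:.6f})")
print(f"exp(mean ln ratio): {ratio_from_log:.6f} (reported {reported_ratio:.6f})")
print(f"PPL(Q)/PPL(base)  : {ratio_direct:.6f}")
```

Under those assumptions all three derived values agree with the reported rows to within the six printed decimals, which suggests the block is internally consistent.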