====== Perplexity statistics ======
Mean PPL(Q)                   :   7.697080 ±   0.053150
Mean PPL(base)                :   7.669212 ±   0.052592
Cor(ln(PPL(Q)), ln(PPL(base))):  99.90%
Mean ln(PPL(Q)/PPL(base))     :   0.003627 ±   0.000311
Mean PPL(Q)/PPL(base)         :   1.003634 ±   0.000312
Mean PPL(Q)-PPL(base)         :   0.027868 ±   0.002426

====== KL divergence statistics ======
Mean    KLD:   0.003182 ±   0.000018
Maximum KLD:   1.478084
99.9%   KLD:   0.065039
99.0%   KLD:   0.023537
Median  KLD:   0.001863
10.0%   KLD:   0.000042
 5.0%   KLD:   0.000008
 1.0%   KLD:  -0.000000
Minimum KLD:  -0.000151

====== Token probability statistics ======
Mean    Δp:  0.077 ± 0.004 %
Maximum Δp: 61.910%
99.9%   Δp: 10.808%
99.0%   Δp:  5.423%
95.0%   Δp:  2.773%
90.0%   Δp:  1.672%
75.0%   Δp:  0.398%
Median  Δp:  0.001%
25.0%   Δp: -0.249%
10.0%   Δp: -1.415%
 5.0%   Δp: -2.457%
 1.0%   Δp: -5.115%
 0.1%   Δp: -10.383%
Minimum Δp: -33.415%
RMS Δp    :  1.739 ± 0.012 %
Same top p: 97.136 ± 0.043 %
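
The three relative perplexity figures above are mutually consistent, which can be checked by hand. The following is a minimal Python sketch of that arithmetic, not part of the tool that produced the log. The variable names are illustrative, and the input values are copied from the reported means with their uncertainties ignored.

```python
import math

# Values copied from the perplexity statistics above (means only).
ppl_q = 7.697080            # Mean PPL(Q)
ppl_base = 7.669212         # Mean PPL(base)
mean_ln_ratio = 0.003627    # Mean ln(PPL(Q)/PPL(base))

# Direct ratio of the two reported perplexities.
ratio_of_means = ppl_q / ppl_base

# Because ln(PPL) is a mean negative log-likelihood per token,
# exponentiating the mean log-ratio should give the same ratio.
ratio_from_log = math.exp(mean_ln_ratio)

# Absolute perplexity gap between the two models.
ppl_diff = ppl_q - ppl_base

print(f"PPL(Q)/PPL(base)   = {ratio_of_means:.6f}")
print(f"exp(mean ln ratio) = {ratio_from_log:.6f}")
print(f"PPL(Q)-PPL(base)   = {ppl_diff:.6f}")
```

Both ratio computations should agree with the reported `Mean PPL(Q)/PPL(base)` to within rounding, and the difference should match the reported `Mean PPL(Q)-PPL(base)`.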