====== Perplexity statistics ======
Mean PPL(Q)                   :   7.668204 ±   0.052714
Mean PPL(base)                :   7.669212 ±   0.052592
Cor(ln(PPL(Q)), ln(PPL(base))):  99.94%
Mean ln(PPL(Q)/PPL(base))     :  -0.000131 ±   0.000238
Mean PPL(Q)/PPL(base)         :   0.999869 ±   0.000238
Mean PPL(Q)-PPL(base)         :  -0.001008 ±   0.001825

====== KL divergence statistics ======
Mean    KLD:   0.001478 ±   0.000032
Maximum KLD:   4.769487
99.9%   KLD:   0.028515
99.0%   KLD:   0.009430
Median  KLD:   0.000898
10.0%   KLD:   0.000021
 5.0%   KLD:   0.000004
 1.0%   KLD:  -0.000001
Minimum KLD:  -0.000183

====== Token probability statistics ======
Mean    Δp:  0.006 ± 0.003 %
Maximum Δp: 66.388%
99.9%   Δp:  7.271%
99.0%   Δp:  3.796%
95.0%   Δp:  1.931%
90.0%   Δp:  1.136%
75.0%   Δp:  0.236%
Median  Δp:  0.000%
25.0%   Δp: -0.229%
10.0%   Δp: -1.136%
 5.0%   Δp: -1.916%
 1.0%   Δp: -3.721%
 0.1%   Δp: -6.644%
Minimum Δp: -41.831%
RMS Δp    :  1.243 ± 0.014 %
Same top p: 97.929 ± 0.037 %
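
For readers who want to see how figures of this kind can arise, here is a minimal Python sketch. It is not the tool's own code. It assumes that KLD is the per-token KL divergence from the base model's next-token distribution to the quantized model's, that Δp is the change (in percentage points) in the probability assigned to the correct token, and that "Same top p" is the share of positions where both models rank the same token first. The function names and the toy logits are hypothetical.

```python
import math


def log_softmax(logits):
    """Numerically stable log-probabilities for one position."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def compare_models(base_logits, quant_logits, targets):
    """Summarize how a quantized model (Q) deviates from the base model.

    base_logits / quant_logits: one list of logits per token position.
    targets: index of the correct next token at each position.
    The definitions used here are assumptions, see the text above.
    """
    klds, dps, same_top = [], [], 0
    for lb, lq, t in zip(base_logits, quant_logits, targets):
        lpb, lpq = log_softmax(lb), log_softmax(lq)
        # KL(base || Q) = sum_i p_base(i) * (ln p_base(i) - ln p_Q(i))
        klds.append(sum(math.exp(b) * (b - q) for b, q in zip(lpb, lpq)))
        # Change in probability of the correct token, in percentage points
        dps.append(100.0 * (math.exp(lpq[t]) - math.exp(lpb[t])))
        # Do both models pick the same most likely token?
        same_top += lpb.index(max(lpb)) == lpq.index(max(lpq))
    n = len(klds)
    return {
        "mean_kld": sum(klds) / n,
        "mean_dp": sum(dps) / n,
        "rms_dp": math.sqrt(sum(d * d for d in dps) / n),
        "same_top_p": 100.0 * same_top / n,
    }


if __name__ == "__main__":
    # Toy data: two positions over a three-token vocabulary.
    base = [[2.0, 1.0, 0.0], [0.5, 0.1, 3.0]]
    quant = [[1.9, 1.1, 0.0], [0.4, 0.2, 2.9]]
    print(compare_models(base, quant, targets=[0, 2]))
```

The sketch works in log-space so that very small probabilities do not cause a division by zero. The percentile rows in the report would come from sorting the per-token values instead of averaging them.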