====== Perplexity statistics ======
Mean PPL(Q) : 39.396539 ± 0.409082
Mean PPL(base) : 23.352232 ± 0.220841
Cor(ln(PPL(Q)), ln(PPL(base))): 94.50%
Mean ln(PPL(Q)/PPL(base)) : 0.522985 ± 0.003415
Mean PPL(Q)/PPL(base) : 1.687057 ± 0.005762
Mean PPL(Q)-PPL(base) : 16.044307 ± 0.213016
====== KL divergence statistics ======
Mean KLD: 0.448856 ± 0.001660
Maximum KLD: 11.228906
99.9% KLD: 5.432167
99.0% KLD: 3.097294
99.0% KLD: 3.097294
Median KLD: 0.213731
10.0% KLD: 0.005922
5.0% KLD: 0.001612
1.0% KLD: 0.000166
Minimum KLD: -0.000425
====== Token probability statistics ======
Mean Δp: -2.648 ± 0.043 %
Maximum Δp: 96.523%
99.9% Δp: 66.438%
99.0% Δp: 40.723%
95.0% Δp: 18.234%
90.0% Δp: 8.706%
75.0% Δp: 0.595%
Median Δp: -0.023%
25.0% Δp: -2.797%
10.0% Δp: -17.572%
5.0% Δp: -33.515%
1.0% Δp: -71.845%
0.1% Δp: -94.533%
Minimum Δp: -99.575%
RMS Δp : 16.773 ± 0.071 %
Same top p: 74.363 ± 0.112 %
llama_perf_context_print: load time = 79800.46 ms
llama_perf_context_print: prompt eval time = 1798108.56 ms / 304128 tokens ( 5.91 ms per token, 169.14 tokens per second)
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_perf_context_print: total time = 1960563.16 ms / 304129 tokens