Is Q2_K_XL the best model? IQ2_XXS scores higher than Q2_K_XL on the MMLU-Pro benchmark

#36
by albertchow - opened

image.png

Where can we get the MMLU-Pro benchmark results for Q2_K_XL and IQ2_XXS?

Without the data we can't tell. It could just be randomness, because we're running R1 at temperature 0.6 and it's doing reasoning.

Well, I only tested the computer science category (410 questions), zero-shot, following the usage recommendations on the DeepSeek-R1 page. Q2_K_XL scored 80.24 and IQ2_XXS scored 84.15.
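For what it's worth, here is a rough way to check whether a gap like that could be noise. This is only a sketch: it assumes the two scores are percentages of the 410 questions (so roughly 329 and 345 correct) and treats the two runs as independent samples in a two-proportion z-test. The function name is made up for illustration. Since both quants answered the same questions, a paired test on per-question results would be more appropriate if that data is available, and this also ignores run-to-run variance from sampling at temperature 0.6.

```python
import math


def two_proportion_z_test(correct_a, correct_b, n):
    """Two-sided z-test for a difference between two accuracies.

    Assumes both models were scored on n questions each and treats
    the two runs as independent samples (a simplification).
    """
    p_a = correct_a / n
    p_b = correct_b / n
    # Pooled accuracy under the null hypothesis of no real difference.
    pooled = (correct_a + correct_b) / (2 * n)
    std_err = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p_b - p_a) / std_err
    # Standard normal CDF via the error function.
    cdf = 0.5 * (1 + math.erf(abs(z) / math.sqrt(2)))
    p_value = 2 * (1 - cdf)
    return z, p_value


N = 410  # computer science questions, from the post above
# Assumption: reported scores are percent correct, converted back to counts.
q2_k_xl_correct = round(0.8024 * N)
iq2_xxs_correct = round(0.8415 * N)

z, p_value = two_proportion_z_test(q2_k_xl_correct, iq2_xxs_correct, N)
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under those assumptions z comes out at about 1.46 with a two-sided p of about 0.14, so a gap of 16 questions out of 410 is not enough on its own to say one quant is better. More categories or repeated runs would help.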
