Is Q2_K_XL really the best model? IQ2_XXS scores higher than Q2_K_XL on the MMLU-Pro benchmark

#36

by albertchow - opened about 22 hours ago

Discussion

albertchow

about 22 hours ago

YudingOOM

about 15 hours ago

Where can we get the MMLU-Pro benchmark results for Q2_K_XL and IQ2_XXS?

Rotating

about 15 hours ago

•

edited about 15 hours ago

Without the data we can't tell. It could just be randomness, since we're running R1 at a temperature of 0.6 and it's doing reasoning, so scores can vary from run to run.

albertchow

about 9 hours ago

•

edited about 9 hours ago

Well, I only tested the computer science category (410 questions), zero-shot, following the usage recommendations on the DeepSeek-R1 page. Q2_K_XL scored 80.24 and IQ2_XXS scored 84.15.
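
To get a rough sense of whether a gap like this is larger than sampling noise, here is a minimal sketch of a two-proportion z-test on the two scores. It makes some assumptions that are not stated in the thread: that the scores are accuracy percentages over the same 410 questions (which works out to 329 and 345 correct answers), and that the two runs can be treated as independent samples. The function name `two_proportion_z_test` is just illustrative. With per-question results, a paired test such as McNemar's would be the better choice, since both quants answered the same questions.

```python
import math


def two_proportion_z_test(correct_a: int, correct_b: int, n: int) -> tuple[float, float]:
    """Two-sided z-test for the difference between two accuracies.

    Assumes both models were scored on n questions each and treats the
    two runs as independent samples (a simplification).
    """
    p_a = correct_a / n
    p_b = correct_b / n
    # Pooled accuracy under the null hypothesis that both models are equally good.
    pooled = (correct_a + correct_b) / (2 * n)
    standard_error = math.sqrt(pooled * (1 - pooled) * (2 / n))
    z = (p_b - p_a) / standard_error
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


N_QUESTIONS = 410  # computer science subset size reported in the thread

# Assumption: the reported scores are percentages of correct answers.
correct_q2_k_xl = round(0.8024 * N_QUESTIONS)
correct_iq2_xxs = round(0.8415 * N_QUESTIONS)

z, p_value = two_proportion_z_test(correct_q2_k_xl, correct_iq2_xxs, N_QUESTIONS)
print(f"Q2_K_XL correct: {correct_q2_k_xl}, IQ2_XXS correct: {correct_iq2_xxs}")
print(f"z = {z:.2f}, two-sided p = {p_value:.3f}")
```

Under these assumptions the difference of 16 questions gives a p-value well above the usual 0.05 threshold, so a single run on 410 questions is probably not enough to say that one quant is better than the other. Repeating the run several times, or testing more categories, would give a clearer picture.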
