Invalid LLM Leaderboard results

#1
by cmp-nct - opened

This model received a math score of 0%. I checked the result set, and this is not due to wrong answers: every answer I checked was correct.
The leaderboard was unable to parse answers in the boxed() format. With correctly parsed results, this model would score significantly higher.
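
To illustrate the kind of parsing involved, here is a minimal sketch of pulling a final answer out of a boxed expression. It assumes the answers use the LaTeX form `\boxed{...}`; the function name `extract_boxed` is hypothetical, and this is not the leaderboard's actual parser.

```python
from typing import Optional


def extract_boxed(text: str) -> Optional[str]:
    # Hypothetical helper: return the contents of the last \boxed{...}
    # in `text`, or None if there is none or the braces are unbalanced.
    marker = r"\boxed{"
    start = text.rfind(marker)
    if start == -1:
        return None

    depth = 1  # we are already inside the opening brace of \boxed{
    contents = []
    for ch in text[start + len(marker):]:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                # Matching close brace for \boxed{ found.
                return "".join(contents)
        contents.append(ch)
    return None  # ran out of text before the braces balanced


if __name__ == "__main__":
    print(extract_boxed(r"So the answer is \boxed{\frac{1}{2}}."))
```

Counting brace depth rather than using a simple regular expression keeps nested braces, such as those in `\frac{1}{2}`, inside the extracted answer.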

Thanks for letting me know. Apparently they may have addressed this; see:

https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard/discussions/1102

CombinHorizon changed discussion status to closed
