Failed orca_mini_v8_* Evaluation

#1051
by pankajmathur - opened

Opening a new discussion, as suggested in a previous comment on another discussion:

Hi @alozowski ,

Happy Monday! I'm reaching out to make sense of the following eval request commits for the model "pankajmathur/orca_mini_v8_0_70b". The commit below shows a file rename and a change from the incorrect "params": 35.277:
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/5660c4c4b9156fa0f15d99be7eee061d5de24764#d2h-741276
Did the model fail to evaluate, and do these changes reflect a resubmission for evaluation?

If so, can we also submit "pankajmathur/orca_mini_v8_1_70b" again, since it shows as failed too?
https://huggingface.co/datasets/open-llm-leaderboard/requests/commit/8b40ba212c48dc470be4f661b67cc085ed456477#d2h-702908

Is there any reason they are failing? For background, I successfully evaluated both of them on my own servers before submitting them to the HF Open LLM Leaderboard, following:

https://huggingface.co/docs/leaderboards/open_llm_leaderboard/about#reproducibility

lm_eval --model hf --model_args pretrained=pankajmathur/orca_mini_v8_1_70b,dtype=bfloat16,parallelize=True --tasks leaderboard --output_path lm_eval_results/leaderboard --batch_size auto

and these results are now posted on both model cards:
https://huggingface.co/pankajmathur/orca_mini_v8_0_70b
https://huggingface.co/pankajmathur/orca_mini_v8_1_70b

Thanks again for helping out with this; it's really appreciated.

Regards,
Pankaj

Hi @alozowski and team,
Is there any update on this? Mainly, why did all of the above model evaluations fail, and is there any way to rerun them?
