Update README.md
README.md
CHANGED
@@ -45,8 +45,10 @@ The following models were included in the merge:
 [Detailed Results + Failed GSM8K](https://huggingface.co/datasets/open-llm-leaderboard/details_ABX-AI__Silver-Sun-v2-11B)
 
 
->[NOTE]
+>[!NOTE]
+>I had to remove GSM8K from the results and manually re-average the rest. GSM8K failed inexplicably, and it should not have.
 >By removing the GSM8K score, the average is VERY close to upstage/SOLAR-10.7B-v1.0 (74.20), which would make sense.
+>Feel free to ignore the actual average and use the other scores individually for reference.
 
 | Metric |Value|
 |---------------------------------|----:|