shenzhi-wang committed
Commit 94d7d87 · verified · 1 Parent(s): 5b3bc73

Update README.md

  1. README.md +6 -2
README.md CHANGED
@@ -92,7 +92,9 @@ print(response)
 
 ### 3.1 Arena-Hard-Auto-v0.1
 
-All results below, except those for `Xwen-72B-Chat`, are sourced from [Arena-Hard-Auto](https://github.com/lmarena/arena-hard-auto) (accessed on February 1, 2025).
+All results below, except those for `Xwen-72B-Chat`, `DeepSeek-V3` and `DeepSeek-R1`, are sourced from [Arena-Hard-Auto](https://github.com/lmarena/arena-hard-auto) (accessed on February 1, 2025).
+
+The results of `DeepSeek-V3` and `DeepSeek-R1` are borrowed from their officially reported results.
 
 #### 3.1.1 No Style Control
 
@@ -100,9 +102,11 @@ All results below, except those for `Xwen-72B-Chat`, are sourced from [Arena-Har
 
 | | Score | 95% CIs |
 | --------------------------------- | ------------------------ | ----------- |
-| **Xwen-72B-Chat** 🔑 | **86.1** (Top-1 among 🔑) | (-1.5, 1.7) |
+| **Xwen-72B-Chat** 🔑 | **86.1** (Top-1 among 🔑 below 100B) | (-1.5, 1.7) |
 | Qwen2.5-72B-Instruct 🔑 | 78.0 | (-1.8, 1.8) |
 | Athene-v2-Chat 🔑 | 85.0 | (-1.4, 1.7) |
+| DeepSeek-V3 **(671B >> 72B)** 🔑 | 85.5 | N/A |
+| DeepSeek-R1 **(671B >> 72B)** 🔑 | **92.3** (Top-1 among 🔑) | N/A |
 | Llama-3.1-Nemotron-70B-Instruct 🔑 | 84.9 | (-1.7, 1.8) |
 | Llama-3.1-405B-Instruct-FP8 🔑 | 69.3 | (-2.4, 2.2) |
 | Claude-3-5-Sonnet-20241022 🔒 | 85.2 | (-1.4, 1.6) |