Commit 1f93c19 by cnmoro · verified · 1 Parent(s): 6f6c11a

Update README.md

Files changed (1)
  1. README.md +138 -0
README.md CHANGED
@@ -80,4 +80,142 @@ response
  # No geral, os LLMs estão se tornando cada vez mais importantes à medida que a tecnologia continua a
  # avançar. À medida que continuamos a usar LLMs em nossas vidas diárias, podemos esperar ver ainda
  # mais desenvolvimentos interessantes no futuro.
+ ```
+
+ ```md
+ ## Overall Results
+
+ | Task | Metric | Value | StdErr |
+ |---------------------------|---------------|---------|---------|
+ | ASSIN2 RTE | F1 Macro | 0.4486 | 0.0067 |
+ | ASSIN2 RTE | Accuracy | 0.5560 | 0.0071 |
+ | ASSIN2 STS | Pearson | 0.4091 | 0.0104 |
+ | ASSIN2 STS | MSE | 5.6395 | N/A |
+ | BluEX | Accuracy | 0.2503 | 0.0094 |
+ | ENEM Challenge | Accuracy | 0.3128 | 0.0071 |
+ | FAQUAD NLI | F1 Macro | 0.4611 | 0.0094 |
+ | FAQUAD NLI | Accuracy | 0.7877 | 0.0113 |
+ | HateBR Offensive (Binary) | F1 Macro | 0.3439 | 0.0049 |
+ | HateBR Offensive (Binary) | Accuracy | 0.4857 | 0.0095 |
+ | OAB Exams | Accuracy | 0.3062 | 0.0057 |
+ | Portuguese Hate Speech (Binary) | F1 Macro | 0.4119 | 0.0038 |
+ | Portuguese Hate Speech (Binary) | Accuracy | 0.7004 | 0.0111 |
+ | TweetSentBR | F1 Macro | 0.5055 | 0.0078 |
+ | TweetSentBR | Accuracy | 0.5697 | 0.0078 |
+
+ ## Detailed Results by Task
+
+ ### ASSIN2 RTE
+
+ | Metric | Value | StdErr |
+ |-------------|---------|---------|
+ | F1 Macro | 0.4486 | 0.0067 |
+ | Accuracy | 0.5560 | 0.0071 |
+
+ ### ASSIN2 STS
+
+ | Metric | Value | StdErr |
+ |-------------|---------|---------|
+ | Pearson | 0.4091 | 0.0104 |
+ | MSE | 5.6395 | N/A |
+
+ ### BluEX
+
+ | Exam ID | Metric | Value | StdErr |
+ |-------------------|----------|---------|---------|
+ | All | Accuracy | 0.2503 | 0.0094 |
+ | USP_2018 | Accuracy | 0.2037 | 0.0315 |
+ | UNICAMP_2018 | Accuracy | 0.1852 | 0.0306 |
+ | UNICAMP_2021_1 | Accuracy | 0.0870 | 0.0240 |
+ | USP_2020 | Accuracy | 0.2143 | 0.0317 |
+ | USP_2023 | Accuracy | 0.2045 | 0.0350 |
+ | UNICAMP_2019 | Accuracy | 0.2600 | 0.0358 |
+ | USP_2019 | Accuracy | 0.1500 | 0.0326 |
+ | UNICAMP_2020 | Accuracy | 0.2182 | 0.0321 |
+ | UNICAMP_2021_2 | Accuracy | 0.2941 | 0.0367 |
+ | UNICAMP_2023 | Accuracy | 0.4186 | 0.0433 |
+ | UNICAMP_2024 | Accuracy | 0.3111 | 0.0398 |
+ | USP_2024 | Accuracy | 0.2683 | 0.0398 |
+ | USP_2021 | Accuracy | 0.3269 | 0.0375 |
+ | UNICAMP_2022 | Accuracy | 0.3590 | 0.0444 |
+ | USP_2022 | Accuracy | 0.2857 | 0.0370 |
+
+ ### ENEM Challenge
+
+ | Exam ID | Metric | Value | StdErr |
+ |-----------|----------|---------|---------|
+ | All | Accuracy | 0.3128 | 0.0071 |
+ | 2017 | Accuracy | 0.2845 | 0.0241 |
+ | 2016 | Accuracy | 0.2479 | 0.0226 |
+ | 2016_2 | Accuracy | 0.2846 | 0.0235 |
+ | 2022 | Accuracy | 0.3534 | 0.0240 |
+ | 2012 | Accuracy | 0.3362 | 0.0253 |
+ | 2011 | Accuracy | 0.3333 | 0.0251 |
+ | 2010 | Accuracy | 0.3846 | 0.0260 |
+ | 2014 | Accuracy | 0.3211 | 0.0259 |
+ | 2009 | Accuracy | 0.2696 | 0.0239 |
+ | 2015 | Accuracy | 0.2521 | 0.0229 |
+ | 2023 | Accuracy | 0.3481 | 0.0236 |
+ | 2013 | Accuracy | 0.3333 | 0.0261 |
+
+ ### FAQUAD NLI
+
+ | Metric | Value | StdErr |
+ |-------------|---------|---------|
+ | F1 Macro | 0.4611 | 0.0094 |
+ | Accuracy | 0.7877 | 0.0113 |
+
+ ### HateBR Offensive (Binary)
+
+ | Metric | Value | StdErr |
+ |-------------|---------|---------|
+ | F1 Macro | 0.3439 | 0.0049 |
+ | Accuracy | 0.4857 | 0.0095 |
+
+ ### OAB Exams
+
+ | Exam ID | Metric | Value | StdErr |
+ |-------------|----------|---------|---------|
+ | All | Accuracy | 0.3062 | 0.0057 |
+ | 2011-05 | Accuracy | 0.3375 | 0.0304 |
+ | 2012-06a | Accuracy | 0.2625 | 0.0285 |
+ | 2010-02 | Accuracy | 0.3700 | 0.0279 |
+ | 2017-22 | Accuracy | 0.3500 | 0.0309 |
+ | 2016-20 | Accuracy | 0.3125 | 0.0300 |
+ | 2011-03 | Accuracy | 0.2626 | 0.0255 |
+ | 2015-17 | Accuracy | 0.3205 | 0.0304 |
+ | 2017-23 | Accuracy | 0.2875 | 0.0292 |
+ | 2018-25 | Accuracy | 0.3625 | 0.0311 |
+ | 2016-19 | Accuracy | 0.2436 | 0.0281 |
+ | 2017-24 | Accuracy | 0.1625 | 0.0238 |
+ | 2015-16 | Accuracy | 0.3125 | 0.0300 |
+ | 2011-04 | Accuracy | 0.3250 | 0.0301 |
+ | 2012-07 | Accuracy | 0.3500 | 0.0307 |
+ | 2012-06 | Accuracy | 0.1875 | 0.0253 |
+ | 2012-09 | Accuracy | 0.2468 | 0.0284 |
+ | 2013-12 | Accuracy | 0.3625 | 0.0311 |
+ | 2013-11 | Accuracy | 0.3000 | 0.0295 |
+ | 2010-01 | Accuracy | 0.3412 | 0.0296 |
+ | 2015-18 | Accuracy | 0.2875 | 0.0292 |
+ | 2014-13 | Accuracy | 0.3500 | 0.0308 |
+ | 2013-10 | Accuracy | 0.3125 | 0.0300 |
+ | 2016-20a | Accuracy | 0.2500 | 0.0279 |
+ | 2014-14 | Accuracy | 0.3125 | 0.0301 |
+ | 2012-08 | Accuracy | 0.3000 | 0.0296 |
+ | 2016-21 | Accuracy | 0.3375 | 0.0304 |
+ | 2014-15 | Accuracy | 0.4103 | 0.0321 |
+
+ ### Portuguese Hate Speech (Binary)
+
+ | Metric | Value | StdErr |
+ |-------------|---------|---------|
+ | F1 Macro | 0.4119 | 0.0038 |
+ | Accuracy | 0.7004 | 0.0111 |
+
+ ### TweetSentBR
+
+ | Metric | Value | StdErr |
+ |-------------|---------|---------|
+ | F1 Macro | 0.5055 | 0.0078 |
+ | Accuracy | 0.5697 | 0.0078 |
  ```
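
The added tables report each score as a Value with a StdErr. As a reading aid, the sketch below parses a few rows copied from the Overall Results table and turns each pair into an approximate 95% interval. It assumes StdErr is a standard error and that a normal approximation (value ± 1.96 × StdErr) is acceptable. The commit states neither, and the helper names `parse_rows` and `normal_ci` are hypothetical.

```python
# Rows copied verbatim from the "Overall Results" table in the diff above.
OVERALL_TABLE = """
| Task | Metric | Value | StdErr |
|------|--------|-------|--------|
| ASSIN2 RTE | Accuracy | 0.5560 | 0.0071 |
| ASSIN2 STS | MSE | 5.6395 | N/A |
| BluEX | Accuracy | 0.2503 | 0.0094 |
"""


def parse_rows(table):
    """Parse a 4-column markdown table into (task, metric, value, stderr) tuples."""
    rows = []
    for line in table.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip malformed lines, the header row, and the |---| separator row.
        if len(cells) != 4 or cells[0] == "Task" or set(cells[0]) <= {"-"}:
            continue
        task, metric, value, stderr = cells
        # "N/A" in the table means no standard error was reported.
        rows.append((task, metric, float(value),
                     None if stderr == "N/A" else float(stderr)))
    return rows


def normal_ci(value, stderr, z=1.96):
    """Approximate interval value +/- z * stderr (assumption: normal approximation)."""
    if stderr is None:
        return None
    return (value - z * stderr, value + z * stderr)


for task, metric, value, stderr in parse_rows(OVERALL_TABLE):
    print(task, metric, value, normal_ci(value, stderr))
```

Rows whose StdErr is `N/A`, such as the ASSIN2 STS MSE, yield no interval rather than a guessed one.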