[Ref, Fix] indentation error in answer key selection, longer explanation in demo, exclusion of broken dataset c608f7f Joschka Strueber committed 14 days ago
[Add] add bbh and gpqa benchmarks again with correct answer_index selection 0a42e99 Joschka Strueber committed 15 days ago
[Ref] apply custom css to heatmap, increase size of images 4077e51 Joschka Strueber committed 15 days ago
[Ref, Add] custom css for sizing, move demo utility to its own file bd28414 Joschka Strueber committed 15 days ago
[Add, Ref] Add more info and table on metric, move model list to data/ b90e0d3 Joschka Strueber committed 15 days ago
[Fix, Debug] wrong default model, check filter_labels 4b2993a Joschka Strueber committed 15 days ago
[Ref, Add] change default models, remove sorting in plot 8be99c0 Joschka Strueber committed 15 days ago
[Add, Fix] add loading mechanism for cached models, change error to warning when computing heatmap 93d753c Joschka Strueber committed 15 days ago
[Add, Fix] better warnings for missing models, better description 35404bc Joschka Strueber committed 16 days ago
[Add, Ref] integrate similarity computation, fix one-hot for EC, add login option 0f7de99 Joschka Strueber committed 16 days ago
[Add] load models and datasets from hub, compute similarities a48b15f Joschka Strueber committed 17 days ago
[Add, Fix] fix clearing model list, improve axes labels 36159b1 Joschka Strueber committed 17 days ago