Commit History
Add WikiQA
fa8bb65
fix: show partial results even if some evaluations haven't finished
7fdb5f5
fix: read request information even if eval is running
b61f534
switch to flat inflection benchmark
8874217
add submission instructions to about page
80793c6
remove submit tab
117d89c
fix: filtering support for models missing details
5e8e87c
remove intro text and citation block
dcb54b6
add benchmark descriptions and links to About page
67a665c
add winogrande and arc-challenge
56926f2
skip model detail validation for OAI/Anthropic models
4ec9008
fix typo in metric name
b1416b0
remove debug prints
9e6a3bf
fix metric name
a0ee03a
add debug prints
105e1f2
revert to correct usage of ModelDetails (without api)
24c8d00
debug print
ee4b341
debug print
a5c094b
debug print
decb818
debug print
6a989eb
debug print
427f12d
debug print
ea10299
Added empty default for api in ModelDetails
e8f05cc
Added model API to submission screen
20fd601
add Icelandic evals
9ef7f1a
Change metric string
96f9cbe
Comment out winogrande for debugging
ab6318a
Change title
4d276e3
Change title
2a3757e
Make name for HF token explicit
bd503b0
Fix repo names
c9a0e12
Update src/envs.py
d7e7ffd
doc
c1b8a96
Clémentine
simplified the template
24622c4
Clémentine
CPU, TOKEN, env variables (#4)
55cc480
Update src/submission/check_validity.py
6eb8bfd
made token a requirement
f982b8e
Clémentine
test
f0298e1
Clémentine
fix
c15e77e
Clémentine
removed quantization to simplify
b899767
Clémentine
now with a functionning backend
1ffc326
Clémentine
update read
943f952
Clémentine
fixs
314f91a
Clémentine
updated leaderboard
efeee6d
Clémentine
Simplified leaderboard v0
9833cdb
Clémentine
simplified some parts of the code + updated requirements
9d22eee
Clémentine
Added check on tokenizer to prevent submissions which won't run
7302987
Clémentine
Update benchmark count and fix typo (`inetuning->finetuning`) (#395)
7abc6a7
fix order of request file vs request file list, to avoid resubmitting issues
976f398
Clémentine