Commit History

fix: description text
b9777d9

gardarjuto committed on

fix: show partial results even if some evaluations haven't finished
7fdb5f5

gardarjuto committed on

fix: read request information even if eval is running
b61f534

gardarjuto committed on

Update app.py
9a10727
verified

gardari committed on

switch to flat inflection benchmark
8874217

gardarjuto committed on

add wrapping to leaderboard
a5bd804

gardarjuto committed on

add submission instructions to about page
80793c6

gardarjuto committed on

remove submit tab
117d89c

gardarjuto committed on

Update app.py
9b8b426
verified

gardari committed on

debug restart interval
fdb1fcf
verified

gardari committed on

fix: type hints for styling function
0be9d2f

gardarjuto committed on

Factor out floating point styling to a function
90021e9

gardarjuto committed on

fix: filtering support for models missing details
5e8e87c

gardarjuto committed on

remove intro text and citation block
dcb54b6

gardarjuto committed on

add benchmark descriptions and links to About page
67a665c

gardarjuto committed on

Increase floating point number in benchmark metrics
7fcf611

gardarjuto committed on

add winogrande and arc-challenge
56926f2

gardarjuto committed on

show private models by default
2bd1158

gardarjuto committed on

skip model detail validation for OAI/Anthropic models
4ec9008

gardarjuto committed on

fix typo in metric name
b1416b0

gardarjuto committed on

remove debug prints
9e6a3bf

gardarjuto committed on

add debug prints
105e1f2

gardarjuto committed on

revert to correct usage of ModelDetails (without api)
24c8d00

gardarjuto committed on

debug print
a5c094b
verified

gardari committed on

debug print
decb818
verified

gardari committed on

debug print
6a989eb
verified

gardari committed on

debug print
427f12d
verified

gardari committed on

debug print
ea10299
verified

gardari committed on

Added empty default for api in ModelDetails
e8f05cc
verified

gardari committed on

Added model API to submission screen
20fd601
verified

gardari committed on

add Icelandic evals
9ef7f1a
verified

gardari committed on

switch to mideind's fork of Eval Harness
da87917
verified

gardari committed on

Change metric string
96f9cbe
verified

gardari committed on

Comment out winogrande for debugging
ab6318a
verified

gardari committed on

Add task
839d7dc
verified

gardari committed on

Change title
4d276e3
verified

gardari committed on

Change title
2a3757e
verified

gardari committed on

Change title
72a1baf
verified

gardari committed on

Make name for HF token explicit
bd503b0
verified

gardari committed on

Fix repo names
c9a0e12
verified

gardari committed on

Update src/envs.py
d7e7ffd
verified

gardari committed on

Update requirements.txt
bcc83eb
verified

clefourrier HF staff committed on

Update README.md
d0f181a
verified

clefourrier HF staff committed on

Update app.py
84582a1
verified

clefourrier HF staff committed on

doc
c1b8a96

Clémentine committed on

more info README
910a08e

Clémentine committed on