R1-distilled leaderboard: Generate a leaderboard for open-r1 models across benchmarks