TITLE = """<h1 align="center" id="space-title">🤗 Open LLM-Perf Leaderboard 🏋️</h1>"""
INTRODUCTION_TEXT = f"""
The 🤗 Open LLM-Perf Leaderboard 🏋️ aims to benchmark the performance (latency & throughput) of Large Language Models (LLMs) across different hardware, backends, and optimizations using [Optimum-Benchmark](https://github.com/huggingface/optimum-benchmark) and [Optimum](https://github.com/huggingface/optimum) flavors.
Anyone from the community can request a model or a hardware/backend/optimization configuration for automated benchmarking:
- Model evaluation requests should be made in the [🤗 Open LLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard) and will be added to the 🤗 Open LLM-Perf Leaderboard 🏋️ automatically.
- Hardware/Backend/Optimization performance requests should be made in the [community discussions](https://huggingface.co/spaces/optimum/llm-perf-leaderboard/discussions) to assess their relevance and feasibility.
"""
ABOUT_TEXT = """<h3>About the 🤗 Open LLM-Perf Leaderboard 🏋️</h3>
<ul>
<li>To avoid communication-dependent results, only one GPU is used.</li>
<li>LLMs are evaluated with a batch size of 1, generating 1000 tokens.</li>
<li>Peak memory is measured in MB during the first forward pass of the LLM (no warmup).</li>
<li>Each pair of (Model Type, Weight Class) is represented by its best-scoring model. This LLM is the one used for all the hardware/backend/optimization experiments.</li>
<li>Score is the average evaluation score obtained from the <a href="https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard">🤗 Open LLM Leaderboard</a>.</li>
<li>Ranking is based on the Euclidean distance from the "Perfect LLM" (i.e. 0 latency and 100% accuracy).</li>
</ul>
"""
CITATION_BUTTON_LABEL = "Copy the following snippet to cite these results."
CITATION_BUTTON_TEXT = r"""@misc{open-llm-perf-leaderboard,
author = {Ilyas Moutawwakil and Régis Pierrard},
title = {Open LLM-Perf Leaderboard},
year = {2023},
publisher = {Hugging Face},
howpublished = "\url{https://huggingface.co/spaces/optimum/llm-perf-leaderboard}",
}
@software{optimum-benchmark,
author = {Ilyas Moutawwakil and Régis Pierrard},
publisher = {Hugging Face},
title = {Optimum-Benchmark: A framework for benchmarking the performance of Transformers models with different hardwares, backends and optimizations.},
}
"""