Update README.md
README.md
CHANGED
@@ -51,7 +51,9 @@ We evaluated FalconLite against benchmarks that are specifically designed to ass
 | [Question Answering with Long Input Texts](https://nyu-mll.github.io/quality/) | 46.9% | 40.8% |
 
 ### Performance ###
-**metrics** = the average number of generated tokens per second (TPS) =
+**metrics** = the average number of generated tokens per second (TPS) =
+
+`nb-generated-tokens` / `end-to-end-response-time`
 
 The `end-to-end-response-time` = when the last token is generated - when the inference request is received
 
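
The TPS definition in the changed README lines can be illustrated with a short sketch. This is only an illustration, assuming both timestamps are measured in seconds on the same clock; the function name `tokens_per_second` and its parameter names are hypothetical and do not come from the FalconLite repository.

```python
def tokens_per_second(nb_generated_tokens: int,
                      request_received_at: float,
                      last_token_generated_at: float) -> float:
    """Average generated tokens per second (TPS) for one request.

    Hypothetical helper: follows the README definition
    TPS = nb-generated-tokens / end-to-end-response-time.
    """
    # end-to-end-response-time = time the last token is generated
    #                            minus time the inference request is received
    end_to_end_response_time = last_token_generated_at - request_received_at
    if end_to_end_response_time <= 0:
        raise ValueError("last token must be generated after the request is received")
    return nb_generated_tokens / end_to_end_response_time


# Example: 200 tokens generated, request received at t=10.0 s, last token at t=18.0 s
print(tokens_per_second(200, 10.0, 18.0))  # prints 25.0
```

Because the denominator is the full end-to-end time, this figure includes prompt processing and any queuing delay, so it may be lower than a decode-only throughput number.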