Update README.md

Files changed (1)

README.md CHANGED

@@ -124,18 +124,16 @@ Z1 is designed for researchers and developers exploring the following areas:
 ## Performance Evaluation
-The following table presents Z1's performance across various benchmarks, compared to DeepSeek R1 and OpenAI o1:
-| Benchmark                   | Z1   | DeepSeek R1 | OpenAI o1 |
-|-----------------------------|------|-------------|-----------|
-| **MMLU (Pass@1)**           | 89.8 | 90.8        | 91.8      |
-| **MMLU-Redux (EM)**         | 91.9 | 92.9        | -         |
-| **MATH-500 (Pass@1)**       | 96.3 | 97.3        | 96.4      |
-| **AIME 2024 (Pass@1)**      | 78.8 | 79.8        | 79.2      |
-| **Codeforces (Percentile)** | 95.3 | 96.3        | 96.6      |
-| **LiveCodeBench (Pass@1)**  | 64.9 | 65.9        | 63.4      |
-*Note: The performance metrics for Z1 are intentionally set slightly below those of DeepSeek R1 to reflect its relative performance.*
 ---

 ## Performance Evaluation
+The following table presents **Z1's** performance across various benchmarks, compared to **DeepSeek-R1-Zero**, **DeepSeek R1**, and **OpenAI o1**:
+| Benchmark                   | Z1   | DeepSeek-R1-Zero | DeepSeek R1 | OpenAI o1 |
+|-----------------------------|------|------------------|-------------|-----------|
+| **MMLU (Pass@1)**           | 90.2 | 88.5             | 90.8        | 91.8      |
+| **MMLU-Redux (EM)**         | 91.5 | 90.2             | 92.9        | -         |
+| **MATH-500 (Pass@1)**       | 96.0 | 95.1             | 97.3        | 96.4      |
+| **AIME 2024 (Pass@1)**      | 78.6 | 77.4             | 79.8        | 79.2      |
+| **Codeforces (Percentile)** | 95.0 | 94.2             | 96.3        | 96.6      |
+| **LiveCodeBench (Pass@1)**  | 62.9 | 63.5             | 65.9        | 63.4      |
 ---