Commit e799c68 by Daemontatox
1 parent: 3717c63
Update README.md
README.md CHANGED
@@ -35,18 +35,6 @@ This model is intended for research and development purposes related to text gen
 
 **Focus on Reasoning:** The fine-tuning has been geared towards enhancing the model's ability to tackle reasoning challenges and logic-based tasks.
 
-### Performance Metrics
-
-RA_Reasoner achieves **15% higher scores** than ChatGPT-O1 Mini on key benchmarks:
-
-| Benchmark | Metric | RA_Reasoner | ChatGPT-O1 Mini | Improvement |
-|-------------------------|--------------------------|-------------|-----------------|-------------|
-| MMLU | Average Accuracy | 0.495 | 0.43 | +15% |
-| BigBench Hard | Average Accuracy | 0.414 | 0.36 | +15% |
-| HellaSwag | Average Accuracy | 0.805 | 0.70 | +15% |
-| GSM8k | Average Accuracy | 0.322 | 0.28 | +15% |
-
-These benchmarks highlight RA_Reasoner's superior performance in reasoning, logic, and understanding tasks.
 
 ---
 
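The "+15%" figures in the removed table appear to be relative improvements, that is (model - baseline) / baseline, rather than absolute differences in accuracy. The following is a minimal sketch that checks this reading against the scores quoted in the removed rows. The helper name `relative_improvement` and the `scores` dictionary are illustrative and not part of the repository.

```python
# Scores copied from the table removed in this commit:
# benchmark -> (RA_Reasoner, ChatGPT-O1 Mini)
scores = {
    "MMLU": (0.495, 0.43),
    "BigBench Hard": (0.414, 0.36),
    "HellaSwag": (0.805, 0.70),
    "GSM8k": (0.322, 0.28),
}


def relative_improvement(model: float, baseline: float) -> float:
    """Return the fractional gain of `model` over `baseline` (hypothetical helper)."""
    return (model - baseline) / baseline


# Relative gain per benchmark, as a fraction (0.15 means 15%).
improvements = {
    name: relative_improvement(model, baseline)
    for name, (model, baseline) in scores.items()
}

for name, gain in improvements.items():
    model, baseline = scores[name]
    # Print both readings so the relative and absolute gaps can be compared.
    print(f"{name}: relative {gain:+.1%}, absolute {model - baseline:+.3f}")
```

Under this reading, every row rounds to 15% relative gain, while the absolute accuracy gaps are smaller (for example 0.065 on MMLU).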