alpayariyak committed on
Commit
9603be0
•
1 Parent(s): b9a70f8

Update README.md

  1. README.md +32 -45
README.md CHANGED
@@ -71,36 +71,25 @@ pinned: false
71
 
72
 
73
  <hr>
74
- <p align="center" style="margin-top: 0px; font-size: 1.2em; background-color: #3c72db; padding: 0.5em; border-radius: 0.5em; color: white; font-weight: bold;">
75
- <a href="https://huggingface.co/openchat/openchat_3.5" style="text-decoration: none; color: white;">
76
- <span style="font-size: 1.4em; font-family: 'Helvetica'; letter-spacing: 0.2em">OPENCHAT</span>
77
- <span style="font-size: 1.4em; font-family: 'Helvetica'; background-color: white; padding: 0.2em; border-radius: 0.3em; color: #3c72db;"> 3.5 </span>
78
- <br>
79
- <span>
80
- First 7B Model that Achieves ChatGPT-Level Performance
81
- <br>#1 Open-Source Model on MT-bench scoring 7.81, outperforming 70B models
 
 
 
 
82
  </span>
83
  </a>
84
- <!-- <a href="https://huggingface.co/openchat/openchat_3.5">
85
- <button class="common-button">Model Repo</button>
86
- </a>
87
- <a href="https://openchat.team">
88
- <button class="common-button">OpenChatUI Demo</button>
89
- </a>
90
- <a href="https://huggingface.co/spaces/openchat/openchat_3.5">
91
- <button class="common-button">HuggingFace Space</button>
92
- </a>
93
- <a href="https://arxiv.org/pdf/2309.11235.pdf">
94
- <button class="common-button">Paper</button>
95
- </a>
96
- -->
97
- </p>
98
-
99
 
100
- <div align="center" style="justify-content: center; align-items: center; "'>
101
- <img src="https://github.com/alpayariyak/openchat/blob/master/assets/3.5-benchmarks.png?raw=true" style="width: 100%; border-radius: 0.5em">
102
- </div>
103
- </p>
104
 
105
  <h1 style="vertical-align: middle;">
106
  <img src="https://github.com/alpayariyak/openchat/blob/master/assets/logo_nobg.png?raw=true" alt="OpenChat Logo" style="width:20px; vertical-align: middle; display: inline-block; margin-right: 5px; margin-left: 0px; margin-top: 0px; margin-bottom: 0px;"/>About OpenChat
@@ -120,26 +109,24 @@ pinned: false
120
 
121
  # 📊 Benchmarks
122
 
123
- | Model | # Params | Average | MT-Bench | AGIEval | BBH MC | TruthfulQA | MMLU | HumanEval | BBH CoT | GSM8K |
124
- |--------------------|----------|----------|--------------|----------|----------|---------------|--------------|-----------------|-------------|--------------|
125
- | OpenChat-3.5 | **7B** | **61.6** | 7.81 | **47.4** | **47.6** | **59.1** | 64.3 | **55.5** | 63.5 | **77.3** |
126
- | ChatGPT (March)* | ? | 61.5 | **7.94** | 47.1 | **47.6** | 57.7 | **67.3** | 48.1 | **70.1** | 74.9 |
127
- | | | | | | | | | | | |
128
- | OpenHermes 2.5 | 7B | 59.3 | 7.54 | 46.5 | 49.4 | 57.5 | 63.8 | 48.2 | 59.9 | 73.5 |
129
- | OpenOrca Mistral | 7B | 52.7 | 6.86 | 42.9 | 49.4 | 45.9 | 59.3 | 38.4 | 58.1 | 59.1 |
130
- | Zephyr-β^ | 7B | 34.6 | 7.34 | 39.0 | 40.6 | 40.8 | 39.8 | 22.0 | 16.0 | 5.1 |
131
- | Mistral** | 7B | - | 6.84 | 38.0 | 39.0 | - | 60.1 | 30.5 | - | 52.2 |
132
- | Open-source SOTA** | 13B-70B | 61.4 | 7.71 | 41.7 | 49.7 | 62.3 | 63.7 | 73.2 | 41.4 | 82.3 |
133
- | | | | WizardLM 70B | Orca 13B | Orca 13B | Platypus2 70B | WizardLM 70B | WizardCoder 34B | Flan-T5 11B | MetaMath 70B |
134
-
135
-
136
  ## 𝕏 Comparison with [X.AI Grok](https://x.ai/)
137
 
138
- | | License | # Param | Average | MMLU | HumanEval | MATH | GSM8k |
139
- |--------------|-------------|---------|----------|------|-----------|----------|----------|
140
- | OpenChat 3.5 | Apache-2.0 | 7B | **56.4** | 64.3 | 55.5 | **28.6** | **77.3** |
141
- | Grok-0 | Proprietary | 33B | 44.5 | 65.7 | 39.7 | 15.7 | 56.8 |
142
- | Grok-1 | Proprietary | ? | 55.8 | 73 | 63.2 | 23.9 | 62.9 |
 
143
 
144
  # 💌Contact
145
 
 
71
 
72
 
73
  <hr>
74
+ <div style="background-color: white; padding: 0.7em; border-radius: 0.5em; color: black; display: flex; flex-direction: column; justify-content: center; text-align: center; ont-size: 0.5em;">
75
+ <a href="https://huggingface.co/openchat/openchat_3.5" style="text-decoration: none; color: black;">
76
+ <span style="font-size: 1.7em; font-family: 'Helvetica'; letter-spacing: 0.1em; font-weight: bold; color: black;">OPENCHAT</span><span style="font-size: 1.8em; font-family: 'Helvetica'; color: #3c72db; ">3.5</span>
77
+ <span style="font-size: 0.7em; font-family: 'Helvetica'; color: white; vertical-align: top; background-color:red; border-radius: 6em; padding: 0.066em 0.4em; letter-spacing: 0.1em; font-weight: bold;">1210</span>
78
+ <span style="font-size: 0.85em; font-family: 'Helvetica'; color: black;">
79
+ <br> πŸ† The Overall Best Performing Open Source 7B Model πŸ†
80
+ <br> πŸ€– Outperforms <span style="font-weight: bold;">ChatGPT</span> (March) and <span style="font-weight: bold;">Grok-1</span> πŸ€–
81
+ <br> πŸš€<span style="font-size: 1em; font-family: 'Helvetica'; color: black; font-weight: bold;">15</span>-point improvement in Coding over <span style="font-size: 0.9em;
82
+ font-family: 'Helvetica'; color: black; font-weight: bold;">OpenChat-3.5πŸš€</span>
83
+ <br><br><span style="font-size: 1em; font-family: 'Helvetica'; color: #3c72db; font-weight: bold;">New Features</span>
84
+ <br> πŸ’‘ 2 Modes: Coding + Generalist, Mathematical Reasoning πŸ’‘
85
+ <br> πŸ§‘β€βš–οΈ Experimental support for Evaluator and Feedback capabilities πŸ§‘β€βš–οΈ
86
  </span>
87
  </a>
88
+ </div>
 
 
 
 
 
 
 
 
 
 
 
 
 
 
89
 
90
+ <div style="display: flex; justify-content: center; align-items: center">
91
+ <img src="https://github.com/alpayariyak/openchat/blob/master/assets/1210bench.png?raw=true" style="width: 100%; border-radius: 1em">
92
+ </div>
 
93
 
94
  <h1 style="vertical-align: middle;">
95
  <img src="https://github.com/alpayariyak/openchat/blob/master/assets/logo_nobg.png?raw=true" alt="OpenChat Logo" style="width:20px; vertical-align: middle; display: inline-block; margin-right: 5px; margin-left: 0px; margin-top: 0px; margin-bottom: 0px;"/>About OpenChat
 
109
 
110
  # 📊 Benchmarks
111
 
112
+ | Model | # Params | Average | MT-Bench | HumanEval | BBH MC | AGIEval | TruthfulQA | MMLU | GSM8K | BBH CoT |
113
+ |--------------------|----------|----------|--------------|-----------------|----------|----------|---------------|--------------|--------------|-------------|
114
+ | OpenChat-3.5-1210 | **7B** | **63.8** | 7.76 | **68.9** | **49.5** | **48.0** | **61.8** | 65.3 | **77.3** | 61.8 |
115
+ | OpenChat-3.5 | **7B** | 61.6 | 7.81 | 55.5 | 47.6 | 47.4 | 59.1 | 64.3 | **77.3** | 63.5 |
116
+ | ChatGPT (March)* | ? | 61.5 | **7.94** | 48.1 | 47.6 | 47.1 | 57.7 | **67.3** | 74.9 | **70.1** |
117
+ | | | | | | | | | | | |
118
+ | OpenHermes 2.5 | 7B | 59.3 | 7.54 | 48.2 | 49.4 | 46.5 | 57.5 | 63.8 | 73.5 | 59.9 |
119
+ | OpenOrca Mistral | 7B | 52.7 | 6.86 | 38.4 | 49.4 | 42.9 | 45.9 | 59.3 | 59.1 | 58.1 |
120
+ | Zephyr-β^ | 7B | 34.6 | 7.34 | 22.0 | 40.6 | 39.0 | 40.8 | 39.8 | 5.1 | 16.0 |
121
+ | Mistral | 7B | - | 6.84 | 30.5 | 39.0 | 38.0 | - | 60.1 | 52.2 | - |
 
 
 
122
  ## 𝕏 Comparison with [X.AI Grok](https://x.ai/)
123
 
124
+ | | License | # Param | Average | MMLU | HumanEval | MATH | GSM8k |
125
+ |-------------------|-------------|---------|----------|------|-----------|----------|----------|
126
+ | OpenChat 3.5 1210 | Apache-2.0 | **7B** | **60.1** | 65.3 | **68.9** | **28.9** | **77.3** |
127
+ | OpenChat 3.5 | Apache-2.0 | **7B** | 56.4 | 64.3 | 55.5 | 28.6 | **77.3** |
128
+ | Grok-0 | Proprietary | 33B | 44.5 | 65.7 | 39.7 | 15.7 | 56.8 |
129
+ | Grok-1 | Proprietary | ???B | 55.8 | 73 | 63.2 | 23.9 | 62.9 |
130
 
131
  # 💌Contact
132