amezasor committed
Commit: 3dd57b8
Parent(s): 1edd1f6

update: eval results and github repo link

Files changed (1): README.md (+23, −25)
README.md CHANGED
@@ -20,7 +20,7 @@ model-index:
     metrics:
     - name: pass@1
       type: pass@1
-      value: 65.25
+      value: 65.54
       veriefied: false
   - task:
       type: text-generation
@@ -30,7 +30,7 @@ model-index:
     metrics:
     - name: pass@1
       type: pass@1
-      value: 33.13
+      value: 33.27
       veriefied: false
   - task:
       type: text-generation
@@ -50,7 +50,7 @@ model-index:
     metrics:
     - name: pass@1
       type: pass@1
-      value: 74.43
+      value: 80.90
       veriefied: false
   - task:
       type: text-generation
@@ -90,7 +90,7 @@ model-index:
     metrics:
     - name: pass@1
       type: pass@1
-      value: 81.03
+      value: 83.61
       veriefied: false
   - task:
       type: text-generation
@@ -130,7 +130,7 @@ model-index:
     metrics:
     - name: pass@1
       type: pass@1
-      value: 52.73
+      value: 63.40
       veriefied: false
   - task:
       type: text-generation
@@ -150,7 +150,17 @@ model-index:
     metrics:
     - name: pass@1
       type: pass@1
-      value: 64.25
+      value: 49.31
+      veriefied: false
+  - task:
+      type: text-generation
+    dataset:
+      type: reasoning
+      name: MUSR
+    metrics:
+    - name: pass@1
+      type: pass@1
+      value: 41.08
       veriefied: false
   - task:
       type: text-generation
@@ -160,7 +170,7 @@ model-index:
     metrics:
     - name: pass@1
       type: pass@1
-      value: 44.51
+      value: 52.44
       veriefied: false
   - task:
       type: text-generation
@@ -180,7 +190,7 @@ model-index:
     metrics:
     - name: pass@1
       type: pass@1
-      value: 61.87
+      value: 64.06
       veriefied: false
   - task:
       type: text-generation
@@ -192,16 +202,6 @@ model-index:
       type: pass@1
       value: 29.28
       veriefied: false
-  - task:
-      type: text-generation
-    dataset:
-      type: multilingual
-      name: MGSM
-    metrics:
-    - name: pass@1
-      type: pass@1
-      value: 51.60
-      veriefied: false
 ---
 <!-- ![image/png](https://cdn-uploads.huggingface.co/production/uploads/62cd5057674cdb524450093d/1hzxoPwqkBJXshKVVe6_9.png) -->
 
@@ -211,15 +211,15 @@ model-index:
 **Granite-3.0-8B-Base** is an open-source decoder-only language model from IBM Research that supports a variety of text-to-text generation tasks (e.g., question-answering, text-completion). **Granite-3.0-8B-Base** is trained from scratch and follows a two-phase training strategy. In the first phase, it is trained on 10 trillion tokens sourced from diverse domains. During the second phase, it is further trained on 2 trillion tokens using a carefully curated mix of high-quality data, aiming to enhance its performance on specific tasks.
 
 - **Developers:** IBM Research
-- **GitHub Repository:** [ibm-granite/granite-language-models](https://github.com/ibm-granite/granite-language-models)
+- **GitHub Repository:** [ibm-granite/granite-3.0-language-models](https://github.com/ibm-granite/granite-3.0-language-models)
 - **Website**: [Granite Docs](https://www.ibm.com/granite/docs/)
-- **Paper:** [Granite Language Models](https://) <!-- TO DO: Update github repo link when it is ready -->
+- **Paper:** [Granite 3.0 Language Models]()
 - **Release Date**: October 21st, 2024
-- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0).
+- **License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)
 
 <!-- de/es/fr/ja/pt/ar/cs/it/ko/nl/zh -->
 ## Supported Languages
-English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese (Simplified)
+English, German, Spanish, French, Japanese, Portuguese, Arabic, Czech, Italian, Korean, Dutch, Chinese
 
 ## Usage
 ### Intended use
@@ -258,8 +258,6 @@ output = tokenizer.batch_decode(output)
 print(output)
 ```
 
-<!-- ['Where is the MIT-IBM Watson AI Lab located?\n\nThe MIT-IBM Watson AI Lab is located in Cambridge, Massachusetts.\n\nWhat is the mission of the MIT-IBM Watson AI Lab?\n\nThe mission of the MIT-IBM Watson AI Lab is to advance the state of the art in artificial intelligence (AI) and machine learning (ML) through collaboration between MIT and IBM.\n\nWhat are some of the projects being worked on at the MIT-IBM Watson AI Lab?\n\nSome of the projects being worked on at the MIT-IBM Watson AI Lab include developing new AI and ML algorithms, applying AI and ML to real-world problems, and exploring the ethical implications of AI and ML.\n\nWhat is the significance of the MIT-IBM Watson AI Lab?<|endoftext|>'] -->
-
 ## Model Architeture
 **Granite-3.0-8B-Base** is based on a decoder-only dense transformer architecture. Core components of this architecture are: GQA and RoPE, MLP with SwiGLU, RMSNorm, and shared input/output embbeddings.
 
@@ -303,4 +301,4 @@ The use of Large Language Models involves risks and ethical considerations peopl
   year = {2024},
   url = {https://arxiv.org/abs/0000.00000},
 }
-```
+```
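
The metric this commit updates throughout the model-index is pass@1, the fraction of problems solved on the first sampled completion. As a reference point (not part of the commit), the standard unbiased pass@k estimator can be sketched as follows; the helper name `pass_at_k` is illustrative:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: 1 - C(n-c, k) / C(n, k),
    where n completions were sampled and c of them passed."""
    if n - c < k:
        # Every size-k draw must contain at least one passing sample.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

At k=1 the estimator reduces to the plain pass rate c/n, which is why pass@1 values like those in this diff can be read as simple per-task accuracies.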
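
The card's architecture section names RMSNorm and a SwiGLU MLP among the model's core components. As a rough illustration of what those two pieces compute, here is a toy pure-Python sketch under my own naming, not the model's actual implementation:

```python
import math

def rms_norm(x, gain, eps=1e-6):
    """RMSNorm: rescale x by its root mean square, then apply a learned gain."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    return [g * v / rms for g, v in zip(gain, x)]

def silu(z):
    """SiLU activation z * sigmoid(z), the nonlinearity inside SwiGLU."""
    return z / (1.0 + math.exp(-z))

def matvec(w, x):
    """Dense projection without bias: w @ x."""
    return [sum(wi * xi for wi, xi in zip(row, x)) for row in w]

def swiglu_mlp(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward block: down-project silu(gate(x)) * up(x)."""
    hidden = [silu(g) * u for g, u in zip(matvec(w_gate, x), matvec(w_up, x))]
    return matvec(w_down, hidden)
```

The gated form is the design choice the card alludes to: the MLP widens into two parallel projections, one gating the other through SiLU, before projecting back down.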