Commit 39c1a61 (verified) by aashish1904: Upload README.md with huggingface_hub
---
base_model: tiiuae/Falcon3-10B-Base
library_name: transformers
license: other
license_name: falcon-llm-license
license_link: https://falconllm.tii.ae/falcon-terms-and-conditions.html
tags:
- falcon3
model-index:
- name: Falcon3-10B-Instruct
  results:
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: IFEval (0-Shot)
      type: HuggingFaceH4/ifeval
      args:
        num_few_shot: 0
    metrics:
    - type: inst_level_strict_acc and prompt_level_strict_acc
      value: 78.17
      name: strict accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: BBH (3-Shot)
      type: BBH
      args:
        num_few_shot: 3
    metrics:
    - type: acc_norm
      value: 44.82
      name: normalized accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MATH Lvl 5 (4-Shot)
      type: hendrycks/competition_math
      args:
        num_few_shot: 4
    metrics:
    - type: exact_match
      value: 25.91
      name: exact match
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: GPQA (0-shot)
      type: Idavidrein/gpqa
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 10.51
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MuSR (0-shot)
      type: TAUR-Lab/MuSR
      args:
        num_few_shot: 0
    metrics:
    - type: acc_norm
      value: 13.61
      name: acc_norm
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Instruct
      name: Open LLM Leaderboard
  - task:
      type: text-generation
      name: Text Generation
    dataset:
      name: MMLU-PRO (5-shot)
      type: TIGER-Lab/MMLU-Pro
      config: main
      split: test
      args:
        num_few_shot: 5
    metrics:
    - type: acc
      value: 38.1
      name: accuracy
    source:
      url: https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard?query=tiiuae/Falcon3-10B-Instruct
      name: Open LLM Leaderboard
---

[![QuantFactory Banner](https://lh7-rt.googleusercontent.com/docsz/AD_4nXeiuCm7c8lEwEJuRey9kiVZsRn2W-b4pWlu3-X534V3YmVuVc2ZL-NXg2RkzSOOS2JXGHutDuyyNAUtdJI65jGTo8jT9Y99tMi4H4MqL44Uc5QKG77B0d6-JfIkZHFaUA71-RtjyYZWVIhqsNZcx8-OMaA?key=xt3VSDoCbmTY7o-cwwOFwQ)](https://hf.co/QuantFactory)

# QuantFactory/Falcon3-10B-Instruct-GGUF
This is a quantized version of [tiiuae/Falcon3-10B-Instruct](https://huggingface.co/tiiuae/Falcon3-10B-Instruct), created using llama.cpp.
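
A minimal sketch of downloading and running one of the GGUF files with llama.cpp. The `Q4_K_M` quantization and the exact filename below are assumptions for illustration; check this repo's file list for the variants actually provided.

```shell
# Download a single quantization from this repo
# (the filename shown is an assumption -- pick one from the repo's file list):
huggingface-cli download QuantFactory/Falcon3-10B-Instruct-GGUF \
  Falcon3-10B-Instruct.Q4_K_M.gguf --local-dir .

# Start an interactive chat session with llama.cpp's CLI:
llama-cli -m Falcon3-10B-Instruct.Q4_K_M.gguf -cnv \
  -p "You are a helpful friendly assistant Falcon3 from TII."
```

Lower-bit quantizations trade some accuracy for a smaller memory footprint; higher-bit ones (e.g. Q8_0) stay closer to the original model.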

# Original Model Card

<div align="center">
<img src="https://huggingface.co/datasets/tiiuae/documentation-images/resolve/main/general/falco3-logo.png" alt="drawing" width="500"/>
</div>

# Falcon3-10B-Instruct

The **Falcon3** family of Open Foundation Models is a set of pretrained and instruct LLMs ranging from 1B to 10B parameters.

This repository contains **Falcon3-10B-Instruct**. It achieves state-of-the-art results (at the time of release) on reasoning, language understanding, instruction following, code, and mathematics tasks.
Falcon3-10B-Instruct supports 4 languages (English, French, Spanish, Portuguese) and a context length of up to 32K.

## Model Details
- Architecture
  - Transformer-based causal decoder-only architecture
  - 40 decoder blocks
  - Grouped Query Attention (GQA) for faster inference: 12 query heads and 4 key-value heads
  - Wider head dimension: 256
  - High RoPE value to support long-context understanding: 1000042
  - Uses SwiGLU and RMSNorm
  - 32K context length
  - 131K vocab size
- Depth up-scaled from **Falcon3-7B-Base** with 2 teratokens of data comprising web, code, STEM, high-quality, and multilingual data, using 1024 H100 GPU chips
- Post-trained on 1.2 million samples of STEM, conversational, code, safety, and function-call data
- Supports EN, FR, ES, PT
- Developed by [Technology Innovation Institute](https://www.tii.ae)
- License: TII Falcon-LLM License 2.0
- Model Release Date: December 2024
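
The GQA configuration above translates directly into inference-memory savings: caching 4 key-value heads instead of one per query head shrinks the KV cache threefold. A back-of-the-envelope sketch, assuming an fp16 cache and using only the architecture values listed (the derived numbers are illustrative arithmetic, not measured figures):

```python
# KV-cache size per token, derived from the listed architecture values.
n_layers = 40       # decoder blocks
head_dim = 256      # head dimension
n_q_heads = 12      # query heads
n_kv_heads = 4      # key-value heads (GQA)
bytes_fp16 = 2      # assuming fp16 cache entries

# Keys + values are cached for every layer:
kv_cache_per_token = 2 * n_layers * n_kv_heads * head_dim * bytes_fp16
mha_equivalent = 2 * n_layers * n_q_heads * head_dim * bytes_fp16

print(kv_cache_per_token)                    # 163840 bytes (160 KiB) per token
print(mha_equivalent // kv_cache_per_token)  # 3x smaller than full multi-head attention
```

At the full 32K context, that difference is roughly 5 GiB vs. 15 GiB of cache per sequence.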

## Getting started

<details>
<summary> Click to expand </summary>

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "tiiuae/Falcon3-10B-Instruct"

# Load the model and tokenizer
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How many hours in one day?"
messages = [
    {"role": "system", "content": "You are a helpful friendly assistant Falcon3 from TII, try to follow instructions as much as possible."},
    {"role": "user", "content": prompt}
]
# Format the conversation with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=1024
)
# Strip the prompt tokens so only the completion is decoded
generated_ids = [
    output_ids[len(input_ids):] for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]

response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
print(response)
```

</details>

<br>

## Benchmarks
We report our internal pipeline benchmarks in the following table.
- We use [lm-evaluation harness](https://github.com/EleutherAI/lm-evaluation-harness).
- We report **raw scores** obtained by applying the chat template **without fewshot_as_multiturn** (unlike Llama3.1).
- We use the same batch size across all models.
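
As a sketch, a comparable run with lm-evaluation-harness might look like the following. The task selection, dtype, and batch size here are illustrative, not our exact internal settings:

```shell
# Evaluate with the chat template applied, but without --fewshot_as_multiturn:
lm_eval --model hf \
  --model_args pretrained=tiiuae/Falcon3-10B-Instruct,dtype=bfloat16 \
  --tasks gsm8k \
  --num_fewshot 5 \
  --apply_chat_template \
  --batch_size 8
```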

<table border="1" style="width: 100%; text-align: center; border-collapse: collapse;">
<colgroup>
<col style="width: 10%;">
<col style="width: 10%;">
<col style="width: 7%;">
<col style="width: 7%;">
<col style="background-color: rgba(80, 15, 213, 0.5); width: 7%;">
</colgroup>
<thead>
<tr>
<th>Category</th>
<th>Benchmark</th>
<th>Yi-1.5-9B-Chat</th>
<th>Mistral-Nemo-Base-2407 (12B)</th>
<th>Falcon3-10B-Instruct</th>
</tr>
</thead>
<tbody>
<tr>
<td rowspan="3">General</td>
<td>MMLU (5-shot)</td>
<td>70</td>
<td>65.9</td>
<td><b>71.6</b></td>
</tr>
<tr>
<td>MMLU-PRO (5-shot)</td>
<td>39.6</td>
<td>32.7</td>
<td><b>44</b></td>
</tr>
<tr>
<td>IFEval</td>
<td>57.6</td>
<td>63.4</td>
<td><b>78</b></td>
</tr>
<tr>
<td rowspan="3">Math</td>
<td>GSM8K (5-shot)</td>
<td>76.6</td>
<td>73.8</td>
<td><b>83.1</b></td>
</tr>
<tr>
<td>GSM8K (8-shot, COT)</td>
<td>78.5</td>
<td>73.6</td>
<td><b>81.3</b></td>
</tr>
<tr>
<td>MATH Lvl-5 (4-shot)</td>
<td>8.8</td>
<td>0.4</td>
<td><b>22.1</b></td>
</tr>
<tr>
<td rowspan="5">Reasoning</td>
<td>Arc Challenge (25-shot)</td>
<td>51.9</td>
<td>61.6</td>
<td><b>64.5</b></td>
</tr>
<tr>
<td>GPQA (0-shot)</td>
<td><b>35.4</b></td>
<td>33.2</td>
<td>33.5</td>
</tr>
<tr>
<td>GPQA (0-shot, COT)</td>
<td>16</td>
<td>12.7</td>
<td><b>32.6</b></td>
</tr>
<tr>
<td>MUSR (0-shot)</td>
<td><b>41.9</b></td>
<td>38.1</td>
<td>41.1</td>
</tr>
<tr>
<td>BBH (3-shot)</td>
<td>49.2</td>
<td>43.6</td>
<td><b>58.4</b></td>
</tr>
<tr>
<td rowspan="4">CommonSense Understanding</td>
<td>PIQA (0-shot)</td>
<td>76.4</td>
<td>78.2</td>
<td><b>78.4</b></td>
</tr>
<tr>
<td>SciQ (0-shot)</td>
<td>61.7</td>
<td>76.4</td>
<td><b>90.4</b></td>
</tr>
<tr>
<td>Winogrande (0-shot)</td>
<td>-</td>
<td>-</td>
<td>71.3</td>
</tr>
<tr>
<td>OpenbookQA (0-shot)</td>
<td>43.2</td>
<td>47.4</td>
<td><b>48.2</b></td>
</tr>
<tr>
<td rowspan="2">Instruction following</td>
<td>MT-Bench (avg)</td>
<td>8.28</td>
<td><b>8.6</b></td>
<td>8.17</td>
</tr>
<tr>
<td>Alpaca (WC)</td>
<td>25.81</td>
<td><b>45.44</b></td>
<td>24.7</td>
</tr>
<tr>
<td>Tool use</td>
<td>BFCL AST (avg)</td>
<td>48.4</td>
<td>74.2</td>
<td><b>86.3</b></td>
</tr>
<tr>
<td rowspan="2">Code</td>
<td>EvalPlus (0-shot) (avg)</td>
<td>69.4</td>
<td>58.9</td>
<td><b>74.7</b></td>
</tr>
<tr>
<td>Multipl-E (0-shot) (avg)</td>
<td>-</td>
<td>34.5</td>
<td><b>45.8</b></td>
</tr>
</tbody>
</table>

## Useful links
- View our [release blogpost](https://huggingface.co/blog/falcon3).
- Feel free to join [our discord server](https://discord.gg/fwXpMyGc) if you have any questions or want to interact with our researchers and developers.

## Technical Report

Coming soon.

## Citation
If the Falcon3 family was helpful in your work, feel free to cite us.

```
@misc{Falcon3,
    title = {The Falcon 3 family of Open Models},
    author = {TII Team},
    month = {December},
    year = {2024}
}
```

# [Open LLM Leaderboard Evaluation Results](https://huggingface.co/spaces/open-llm-leaderboard/open_llm_leaderboard)
Detailed results can be found [here](https://huggingface.co/datasets/open-llm-leaderboard/tiiuae__Falcon3-10B-Instruct-details).

| Metric              | Value |
|---------------------|------:|
| Avg.                | 35.19 |
| IFEval (0-Shot)     | 78.17 |
| BBH (3-Shot)        | 44.82 |
| MATH Lvl 5 (4-Shot) | 25.91 |
| GPQA (0-shot)       | 10.51 |
| MuSR (0-shot)       | 13.61 |
| MMLU-PRO (5-shot)   | 38.10 |
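
The leaderboard average is the unweighted mean of the six benchmark scores, which can be verified directly:

```python
# Open LLM Leaderboard scores from the table above
scores = {
    "IFEval (0-Shot)": 78.17,
    "BBH (3-Shot)": 44.82,
    "MATH Lvl 5 (4-Shot)": 25.91,
    "GPQA (0-shot)": 10.51,
    "MuSR (0-shot)": 13.61,
    "MMLU-PRO (5-shot)": 38.10,
}

# Unweighted mean, rounded as the leaderboard reports it
avg = sum(scores.values()) / len(scores)
print(round(avg, 2))  # 35.19
```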