---
library_name: transformers
tags:
- bitnet
- falcon3
base_model: tiiuae/Falcon3-7B-Base
---

![image/png](https://cdn-uploads.huggingface.co/production/uploads/62441d1d9fdefb55a0b7d12c/c-tosr0FvMlKuKQTojx_6.png)


#  Table of Contents

0. [TL;DR](#tldr)
1. [Model Details](#model-details)
2. [Training Details](#training-details)
3. [Usage](#usage)
4. [Evaluation](#evaluation)
5. [Citation](#citation)


# TL;DR

# Model Details

## Model Description

- **Developed by:** [https://www.tii.ae](https://www.tii.ae)
- **Model type:** Causal decoder-only
- **Architecture:** Pure-transformer - 1.58bit version
- **Language(s) (NLP):** Mainly English
- **License:** TII Falcon License 2.0

# Training Details

The model was trained following the strategies described in the recent [1-bit LLM HF blogpost](https://huggingface.co/blog/1_58_llm_extreme_quantization) and the [1-bit LLM paper](https://huggingface.co/papers/2402.17764).
For more details about the training protocol of this model, please refer to the *Compression* section of the Falcon-3 technical report.
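
As a rough illustration of the ternary ("1.58-bit") weight format described in the linked 1-bit LLM paper, the sketch below applies absmean quantization to a small weight matrix: each weight is divided by the mean absolute value of the matrix, then rounded and clipped to {-1, 0, 1}. The function name `absmean_ternary`, the `eps` default, and the sample values are hypothetical, and plain Python lists stand in for tensors. This is not claimed to be the exact procedure used to train this model.

```python
def absmean_ternary(weights, eps=1e-5):
    """Quantize a 2D list of floats to {-1, 0, 1} using absmean scaling.

    Illustrative sketch only: the function name and eps default are
    assumptions, not part of any library API.
    """
    flat = [abs(w) for row in weights for w in row]
    scale = sum(flat) / len(flat)  # mean absolute weight of the matrix
    quantized = [
        # Scale, round to the nearest integer, then clip to [-1, 1]
        [max(-1, min(1, round(w / (scale + eps)))) for w in row]
        for row in weights
    ]
    return quantized, scale


# Hypothetical example weights
weights = [[0.9, -0.05, 0.4], [-1.2, 0.02, -0.3]]
ternary, scale = absmean_ternary(weights)
print(ternary)  # [[1, 0, 1], [-1, 0, -1]]
```

Small weights collapse to 0 while larger ones keep only their sign, so each weight carries one of three values (log2(3) ≈ 1.58 bits).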


# Usage

To use this model, you can currently rely on either the Hugging Face transformers library or the [BitNet](https://github.com/microsoft/BitNet) library. You can also try the model in the [falcon-1.58bit playground](https://huggingface.co/spaces/tiiuae/falcon3-1.58bit-playground) (available only for the 7B instruct version).

## 🤗 transformers

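```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "tiiuae/Falcon3-7B-Base-1.58bit"

# Load the tokenizer and the model (bfloat16 weights, moved to the GPU)
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
  model_id,
  torch_dtype=torch.bfloat16,
).to("cuda")

# Perform text generation (the prompt and max_new_tokens are example values)
inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```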

## BitNet

```bash
git clone https://github.com/microsoft/BitNet && cd BitNet
pip install -r requirements.txt
python setup_env.py --hf-repo tiiuae/Falcon3-7B-Base-1.58bit -q i2_s
python run_inference.py -m models/Falcon3-7B-1.58bit/ggml-model-i2_s.gguf -p "Hi how are you doing today?" -cnv
```

# Evaluation
The following table reports results from our internal benchmark pipeline:

**Note: the evaluation results are normalized scores from the v2 leaderboard tasks, whereas the results reported for the original models in the blogpost are raw scores.**

<table border="1" style="width: 100%; text-align: center; border-collapse: collapse;">
    <colgroup>
        <col style="width: 10%;">
        <col style="width: 10%;">
        <col style="background-color: rgba(80, 15, 213, 0.5); width: 7%;">
    </colgroup>
    <thead>
        <tr>
            <th>Benchmark</th>
            <th>Llama3-8B-1.58-100B-tokens</th>
            <th>Falcon3-7B-Base-1.58bit </th>
        </tr>
    </thead>
    <tbody>
        <tr>
            <td>IFEval</td>
            <td>17.91</td>
            <td><b>25.43</b></td>
        </tr>      
        <tr>
            <td>MUSR</td>
            <td>4.87</td>
            <td><b>5.75</b></td>
        </tr>
        <tr>
            <td>GPQA</td>
            <td>1.83</td>
            <td><b>2.32</b></td>
        </tr>
        <tr>
            <td>BBH</td>
            <td><b>5.36</b></td>
            <td>3.91</td>
        </tr>
        <tr>
            <td>MMLU-PRO</td>
            <td><b>2.78</b></td>
            <td>1.36</td>
        </tr>      
        <tr>
            <td>MATH</td>
            <td>0.26</td>
            <td><b>0.88</b></td>
        </tr>
        <tr>
            <td>Average</td>
            <td>5.5</td>
            <td><b>6.61</b></td>
        </tr>          
    </tbody>
</table>
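
The Average row can be checked by taking the unweighted mean of the six benchmark scores in each column. The snippet below is a small sanity check using values copied from the table; the variable names are illustrative.

```python
# Scores copied from the table, in the order:
# IFEval, MUSR, GPQA, BBH, MMLU-PRO, MATH.
scores = {
    "Llama3-8B-1.58-100B-tokens": [17.91, 4.87, 1.83, 5.36, 2.78, 0.26],
    "Falcon3-7B-Base-1.58bit": [25.43, 5.75, 2.32, 3.91, 1.36, 0.88],
}

# Unweighted mean per model, rounded to two decimals
averages = {name: round(sum(vals) / len(vals), 2) for name, vals in scores.items()}
for name, avg in averages.items():
    print(f"{name}: {avg}")
```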

# Citation

```
@misc{Falcon3,
    title = {The Falcon 3 Family of Open Models},
    url = {https://huggingface.co/blog/falcon3},
    author = {Falcon-LLM Team},
    month = {December},
    year = {2024}
}
```