---
title: README
emoji: 🚀
colorFrom: red
colorTo: indigo
sdk: static
pinned: false
---

**Do you believe in a better tomorrow? We do. Our team of expert researchers live the dream and work to build it every day.**

**🔥 [Falcon-180B](https://huggingface.co/tiiuae/falcon-180b) is now available in open-access! [Try it now in our chat demo!](https://huggingface.co/spaces/tiiuae/falcon-180b-demo)**

# News

* 💥 **TII has open-sourced Falcon-180B for research and commercial use!** Access the [180B](https://huggingface.co/tiiuae/falcon-180b), as well as [7B](https://huggingface.co/tiiuae/falcon-7b)/[40B](https://huggingface.co/tiiuae/falcon-40b) models, and explore our high-quality web dataset, [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb).
* ✨ **Falcon-[40B](https://huggingface.co/tiiuae/falcon-40b)/[7B](https://huggingface.co/tiiuae/falcon-7b) are now available under the Apache 2.0 license**; TII has [waived all royalties and commercial usage restrictions](https://www.tii.ae/news/uaes-falcon-40b-worlds-top-ranked-ai-model-technology-innovation-institute-now-royalty-free).

# Falcon LLM

Falcon LLM is TII's flagship series of large language models, built from scratch using a custom data pipeline and distributed training library. Papers coming soon 😊.

To promote collaborations and drive innovation, we have open-sourced a number of artefacts:

* The **Falcon-180B** pretrained and chat models, under the [Falcon-180B TII license](https://huggingface.co/spaces/tiiuae/falcon-180b-license/blob/main/LICENSE.txt). Falcon-180B is the largest and most powerful open-access model available.
* The **Falcon-7/40B** pretrained and instruct models, under the Apache 2.0 software license. Falcon-7B/40B models are state-of-the-art for their size, outperforming other open-source models on NLP benchmarks.
* The **RefinedWeb** dataset, a massive web dataset with stringent filtering and large-scale deduplication, enabling models trained on web data alone to match or outperform models trained on curated corpora. See 📓 [the paper](https://arxiv.org/abs/2306.01116) for more information. RefinedWeb is licensed under ODC-By 1.0.

See below for a detailed list of artefacts in the Falcon LLM family:

| **Artefact** | **Link** | **Type** | **Details** |
|---------------------|------------------------------------------------------------------|-------------------------|-------------------------------------------------------------------|
| 🥇 **Falcon-180B** | [Here](https://huggingface.co/tiiuae/falcon-180b) | *pretrained model* | 180B parameters trained on 3,500 billion tokens. |
| Falcon-180B-Chat | [Here](https://huggingface.co/tiiuae/falcon-180b-chat) | *chat model* | Falcon-180B finetuned on a mixture of [Ultrachat](https://huggingface.co/datasets/stingning/ultrachat), [Platypus](https://huggingface.co/datasets/garage-bAInd/Open-Platypus) and [Airoboros](https://huggingface.co/datasets/jondurbin/airoboros-2.1). |
| 🥈 **Falcon-40B** | [Here](https://huggingface.co/tiiuae/falcon-40b) | *pretrained model* | 40B parameters trained on 1,000 billion tokens. |
| Falcon-40B-Instruct | [Here](https://huggingface.co/tiiuae/falcon-40b-instruct) | *instruction/chat model* | Falcon-40B finetuned on the [Baize](https://github.com/project-baize/baize-chatbot) dataset. |
| 🥉 **Falcon-7B** | [Here](https://huggingface.co/tiiuae/falcon-7b) | *pretrained model* | 6.7B parameters trained on 1,500 billion tokens. |
| Falcon-7B-Instruct | [Here](https://huggingface.co/tiiuae/falcon-7b-instruct) | *instruction/chat model* | Falcon-7B finetuned on the [Baize](https://github.com/project-baize/baize-chatbot), [GPT4All](https://github.com/nomic-ai/gpt4all), and [GPTeacher](https://github.com/teknium1/GPTeacher) datasets. |
| 📀 **RefinedWeb** | [Here](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) | *pretraining web dataset* | ~600 billion "high-quality" tokens. |
| Falcon-RW-1B | [Here](https://huggingface.co/tiiuae/falcon-rw-1b) | *pretrained model* | 1.3B parameters trained on 350 billion tokens. |
| Falcon-RW-7B | [Here](https://huggingface.co/tiiuae/falcon-rw-7b) | *pretrained model* | 7.5B parameters trained on 350 billion tokens. |

# About us

The [Technology Innovation Institute](https://www.tii.ae) (TII) is a leading global research center dedicated to pushing the frontiers of knowledge. Our teams of scientists, researchers and engineers work in an open, flexible and agile environment to deliver discovery science and transformative technologies. Our work means we will not only prepare for the future; we will create it. Working together, we are committed to inspiring innovation for a better tomorrow.

We are part of Abu Dhabi Government's Advanced Technology Research Council, which oversees technology research in the emirate. As a disruptor in science, we set new standards and serve as a catalyst for change.

Faced with a future of limitless possibilities and supported by strategically funded investments, we are encouraging a culture of discovery. Our work reinforces Abu Dhabi and the UAE's status as an R&D hub and a global leader in breakthrough technologies.