---
license: llama2
datasets:
- tiiuae/falcon-refinedweb
- EleutherAI/pile
- meta-math/MetaMathQA
language:
- en
library_name: transformers
---
# Saily 220B
<img src="https://i.ibb.co/rG8S6cF/Saily-220-B.png" style="width: 100%; height: auto;"/>
---
## Announcements
**1. Date:** 17th December, 2023
Releasing v1. Saily_220B is a powerful AI model built on top of Llama2-70B merges.
We created 10 fine-tuned **Llama2 70B** models. All of them were fine-tuned on a part of the Refined-Web dataset (common to all),
and each model was then individually fine-tuned on a niche-specific dataset:
- Code
- Humor
- Maths
- Logical Understanding
- Physics
- Reasoning
- Psychology
- Roleplay
We created 4 linear merges, keeping the **Logical-Understanding** and **Reasoning** models constant across all of them,
and then finally created a passthrough merge between the resulting models.
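For readers unfamiliar with the terminology, a linear merge simply averages the weights of several fine-tunes that share the same base architecture. The sketch below illustrates the idea only; it is not the exact pipeline used for Saily_220B, and the checkpoint names and mixing weights are placeholders.
```python
# Hedged sketch of a linear merge (not the authors' actual pipeline).
# Checkpoint names and weights below are placeholders for illustration.
import torch
from transformers import AutoModelForCausalLM

checkpoints = ["ft-reasoning", "ft-logical-understanding", "ft-code", "ft-maths"]
weights = [0.25, 0.25, 0.25, 0.25]  # mixing coefficients, must sum to 1.0

models = [
    AutoModelForCausalLM.from_pretrained(name, torch_dtype=torch.float16)
    for name in checkpoints
]

# Overwrite the first model's parameters with the weighted average
# of the corresponding parameters from all checkpoints.
merged = models[0]
state_dicts = [m.state_dict() for m in models]
with torch.no_grad():
    for key, param in merged.state_dict().items():
        avg = sum(w * sd[key].float() for w, sd in zip(weights, state_dicts))
        param.copy_(avg.to(param.dtype))

merged.save_pretrained("linear-merge-v1")
```
A passthrough merge, by contrast, stacks layer ranges from different models into one deeper network rather than averaging them, which is how the 4 linear merges were combined into the final ~220B model.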
Public Datasets used:
1. [RefinedWeb](https://hf.co/datasets/tiiuae/falcon-refinedweb) (part of it)
2. [Pile](https://hf.co/datasets/EleutherAI/pile) (part of it)
3. [MetaMathQA](https://hf.co/datasets/meta-math/MetaMathQA)
4. Unnatural Code (JavaScript, Python, C++)
### How did we create the private dataset?
We recorded many internal brainstorming sessions where we just talked about random things.
We also invited many experts from different fields:
- Mathematicians
- Developers
- Bio-Engineers
- Authors
- Psychologists
- and others...
We talked about a range of topics with them, recorded the sessions, and then transcribed the audio to create the datasets.
---
### Please don't rely on the config.json in the files; it isn't accurate. You can run:
```python
from transformers import AutoModelForCausalLM as amclm

model = amclm.from_pretrained("deepnight-research/saily_220b",
                              device_map="auto")
print(model.config)
```
to check out the model's configuration.
---
### Try it:
You definitely need GPUs here (that goes without saying).
* We have tried it on **4 x A100 80GB** and **2 x A100 80GB**.
* You will have to load the model in **4bit** to fit on **2 x A100 (80GB)**.
```python
from transformers import AutoModelForCausalLM as amclm
from transformers import AutoTokenizer

model_name = "deepnight-research/saily_220b"

model = amclm.from_pretrained(model_name, device_map="auto")

# To load in 8-bit, make sure you have bitsandbytes installed.
# model = amclm.from_pretrained(model_name,
#                               device_map="auto",
#                               load_in_8bit=True)

# Float16
# import torch
# model = amclm.from_pretrained(model_name,
#                               device_map="auto",
#                               torch_dtype=torch.float16)

tokenizer = AutoTokenizer.from_pretrained(model_name)

input_ids = tokenizer.encode("[INST]\nWrite a poem about cats\n[/INST]\n\n",
                             return_tensors="pt").to(model.device)

output = model.generate(input_ids,
                        max_length=128,
                        do_sample=True,  # needed for temperature/top_p/top_k to take effect
                        temperature=0.7,
                        repetition_penalty=1.1,
                        top_p=0.7,
                        top_k=50)

output_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(output_text)
```
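The snippet above only shows 8-bit and float16 variants, while the note for **2 x A100 (80GB)** calls for 4-bit loading. A minimal sketch of 4-bit loading, assuming a recent `transformers` and `bitsandbytes` are installed; the exact quantization settings are our assumption, not the authors':
```python
import torch
from transformers import AutoModelForCausalLM as amclm
from transformers import BitsAndBytesConfig

# Assumed NF4 settings; adjust the bnb_4bit_* options to your hardware.
quant_config = BitsAndBytesConfig(load_in_4bit=True,
                                  bnb_4bit_quant_type="nf4",
                                  bnb_4bit_compute_dtype=torch.float16)

model = amclm.from_pretrained("deepnight-research/saily_220b",
                              device_map="auto",
                              quantization_config=quant_config)
```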
We recommend following the **Alpaca Prompt Format**, and if you're trying it out in Text-Generation-WebUI, please use **INSTRUCT** or **CHAT-INSTRUCT** mode.
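For reference, here is a sketch of the commonly used Alpaca-style template; the exact template used during fine-tuning is not published, so treat this as an assumption rather than the definitive format:
```python
# Standard Alpaca-style instruction template (our assumption; the model card
# does not publish the exact training template).
alpaca_prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)

prompt = alpaca_prompt.format(instruction="Write a poem about cats")
```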
---
## Limitations and Bias
As with all language models, Saily_220B may generate incorrect or biased content. It's important to keep this in mind when using the model.
---
## Wanna Talk?
Reach out to us at [[email protected]](mailto:[email protected]) or [[email protected]](mailto:[email protected]) |