Saily 220B


Announcements

1. Date: 17th December, 2023. Releasing v1. Saily_220B is a powerful AI model built on top of Llama2-70B merges. We created 10 fine-tuned Llama2 70B models. Every model was fine-tuned on a shared part of the RefinedWeb dataset, and each was then fine-tuned individually on a niche-specific dataset:

  • Code
  • Humor
  • Maths
  • Logical Understanding
  • Physics
  • Reasoning
  • Psychology
  • Roleplay

We created 4 linear merges, keeping the Logical-Understanding and Reasoning models constant in all of them, and then finally created a passthrough merge between the models.
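
To make the two merge types concrete, here is a toy sketch in plain Python. It is illustrative only: the "models" are tiny dicts and lists, the helper names `linear_merge` and `passthrough_merge` are hypothetical, and the weights and layer ranges are made up. It does not reproduce the tooling or settings used for Saily_220B.

```python
# Toy illustration only: a "model" is a dict mapping a parameter name to a
# list of floats. Real merges operate on full checkpoint tensors.

def linear_merge(models, weights=None):
    """Weighted average of same-shaped parameters across models."""
    if weights is None:
        weights = [1.0] * len(models)
    total = sum(weights)
    merged = {}
    for name in models[0]:
        # Walk the parameter element by element across all models.
        columns = zip(*(m[name] for m in models))
        merged[name] = [
            sum(w * v for w, v in zip(weights, col)) / total
            for col in columns
        ]
    return merged


def passthrough_merge(layer_stacks, slices):
    """Stack layer ranges from several models without averaging.

    layer_stacks: list of per-model layer lists.
    slices: list of (model_index, start, end) tuples, end exclusive.
    """
    stacked = []
    for model_index, start, end in slices:
        stacked.extend(layer_stacks[model_index][start:end])
    return stacked


# Linear merge of two toy models with equal weights.
model_a = {"w": [1.0, 2.0]}
model_b = {"w": [3.0, 6.0]}
print(linear_merge([model_a, model_b]))  # {'w': [2.0, 4.0]}

# Passthrough merge: layer ranges from both models are concatenated.
layers_a = ["a0", "a1", "a2", "a3"]
layers_b = ["b0", "b1", "b2", "b3"]
print(passthrough_merge([layers_a, layers_b], [(0, 0, 3), (1, 1, 4)]))
# ['a0', 'a1', 'a2', 'b1', 'b2', 'b3']
```

A linear merge keeps the layer count of its inputs, while a passthrough merge produces a deeper stack, which is how a merge of 70B models can end up with a much larger parameter count.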

Public Datasets used:

  1. RefinedWeb (part of it)
  2. Pile (part of it)
  3. MetaMathQA
  4. Unnatural Code (Javascript, Python, C++)

How did we create the private dataset?

We recorded many internal brain-storming sessions where we just talked about random things. We also invited many experts from different fields:

  • Mathematicians
  • Developers
  • Bio-Engineers
  • Authors
  • Psychologists
  • and others...

We discussed a range of topics with them, recorded the sessions, and then transcribed the audio to create the datasets.


Please don't rely on the config.json in the files; it isn't accurate. Instead, you can run:

from transformers import AutoModelForCausalLM as amclm
model = amclm.from_pretrained("deepnight-research/saily_220b",
    device_map="auto")

# print(model.config)
model.config

to check out the model's configuration.


Try it:

You definitely need GPUs here (that goes without saying).

  • We have tried it on 4 x A100 80GB and 2 x A100 80GB.
  • You will have to load the model in 4-bit to fit on 2 x A100 (80GB).
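
As a rough sanity check on these hardware notes, the sketch below estimates the memory needed for the weights alone at different precisions. It assumes the 208B parameter count listed for the checkpoint and ignores activations, the KV cache, and quantization overhead, so real usage will be higher. `weight_memory_gb` is a hypothetical helper.

```python
def weight_memory_gb(n_params, bits_per_param):
    """Approximate memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


N_PARAMS = 208e9  # assumption: parameter count listed for the checkpoint

for label, bits in [("fp16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(N_PARAMS, bits):.0f} GB")
# fp16: ~416 GB
# 8-bit: ~208 GB
# 4-bit: ~104 GB
```

By this estimate, 4-bit weights fit within 2 x 80 GB, which matches the note above. FP16 weights exceed 4 x 80 GB, so an unquantized load with device_map="auto" would likely spill over to CPU or disk offload.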

from transformers import AutoModelForCausalLM as amclm
from transformers import AutoTokenizer

model_name = "deepnight-research/saily_220b"
model = amclm.from_pretrained(model_name, device_map="auto")

# To load in 8-bit, make sure you have bitsandbytes installed.
# model = amclm.from_pretrained(model_name,
#           device_map="auto",
#           load_in_8bit=True
#        )

# Float16
# import torch
# model = amclm.from_pretrained(model_name,
#            device_map="auto",
#            torch_dtype=torch.float16
#         )

tokenizer = AutoTokenizer.from_pretrained(model_name)

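# Encode the prompt and move the ids to the device holding the model's first layers.
input_ids = tokenizer.encode("[INST]\nWrite a poem about cats\n[/INST]\n\n", return_tensors="pt").to(model.device)

# do_sample=True is required for temperature, top_p, and top_k to take effect.
output = model.generate(input_ids, max_length=128,
            do_sample=True,
            temperature=0.7,
            repetition_penalty=1.1,
            top_p=0.7, top_k=50
        )

output_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(output_text)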
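
The example above shows 8-bit and float16 loading but not the 4-bit path mentioned for 2 x A100. Here is a minimal sketch of that path, assuming a recent transformers release with bitsandbytes and accelerate installed. The NF4 quantization type and float16 compute dtype are illustrative choices, not settings confirmed for this model, and `load_saily_4bit` is a hypothetical helper name.

```python
def load_saily_4bit(model_name="deepnight-research/saily_220b"):
    """Load the model with 4-bit bitsandbytes quantization (needs GPUs)."""
    # Imports live inside the function so the heavy dependencies are only
    # needed when the model is actually loaded.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig

    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",             # assumption: NF4 quantization
        bnb_4bit_compute_dtype=torch.float16,  # assumption: fp16 compute
    )
    return AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",
        quantization_config=quant_config,
    )


# model = load_saily_4bit()  # downloads the full checkpoint; run on a GPU host
```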

We recommend following the Alpaca prompt format. If you're trying the model out in Text-Generation-WebUI, please use INSTRUCT or CHAT-INSTRUCT mode.
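
For reference, here is a small helper that builds a prompt in the widely used Alpaca instruction template (the variant without an input field). The template wording is the common Alpaca text rather than something specified in this model card, and `build_alpaca_prompt` is a hypothetical name.

```python
# Common Alpaca template, no-input variant (assumption: not taken from this card).
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction):
    """Wrap a plain instruction in the Alpaca template."""
    return ALPACA_TEMPLATE.format(instruction=instruction.strip())


prompt = build_alpaca_prompt("Write a poem about cats")
print(prompt)
```

The resulting string can be passed to tokenizer.encode in place of the [INST]-style prompt used in the example above.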


Limitations and Bias

As with all language models, Saily_220B may generate incorrect or biased content. It's important to keep this in mind when using the model.


Wanna Talk?

Reach out to us at [email protected] or [email protected]

Model size: 208B params (Safetensors, tensor type FP16)