|
--- |
|
license: llama2 |
|
datasets: |
|
- tiiuae/falcon-refinedweb |
|
- EleutherAI/pile |
|
- meta-math/MetaMathQA |
|
language: |
|
- en |
|
library_name: transformers |
|
|
|
--- |
|
# Saily 220B |
|
<img src="https://i.ibb.co/rG8S6cF/Saily-220-B.png" style="width: 100%; height: auto;"/> |
|
|
|
--- |
|
## Announcements |
|
**1. Date:** 17th December, 2023
|
Releasing v1. Saily_220B is a powerful AI model built on top of Llama2-70B merges. |
|
We created 10 fine-tuned **Llama2 70B** models. Each was fine-tuned on a part of the RefinedWeb dataset (common to all), and each was then individually fine-tuned on a niche-specific dataset:
|
- Code |
|
- Humor |
|
- Maths |
|
- Logical Understanding |
|
- Physics |
|
- Reasoning |
|
- Psychology |
|
- Roleplay |
|
|
|
We created 4 linear merges, keeping the **Logical Understanding** and **Reasoning** models constant in every merge, and then created a final passthrough merge between the resulting models (a sketch of both merge styles follows).
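For illustration, here is a minimal sketch of what these two merge styles do. The checkpoint names, weights, and layer split are made up; the exact merge recipe is not published.

```python
import torch

def linear_merge(state_dicts, weights):
    # Linear merge: a weighted average of matching parameter tensors
    # across several fine-tunes of the same base model.
    return {
        name: sum(w * sd[name] for w, sd in zip(weights, state_dicts))
        for name in state_dicts[0]
    }

# Toy demo with two tiny single-tensor "checkpoints" (illustrative only):
a = {"w": torch.ones(2, 2)}
b = {"w": torch.zeros(2, 2)}
merged = linear_merge([a, b], [0.5, 0.5])  # every entry becomes 0.5

# A passthrough merge, by contrast, does no averaging: it stacks whole
# decoder-layer ranges from different parent models back to back, which
# is how a ~220B model can be assembled from 70B parents.
```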
|
|
|
Public Datasets used: |
|
1. [RefinedWeb](https://hf.co/datasets/tiiuae/falcon-refinedweb) (part of it) |
|
2. [Pile](https://hf.co/datasets/EleutherAI/pile) (part of it)
|
3. [MetaMathQA](https://hf.co/datasets/meta-math/MetaMathQA) |
|
4. Unnatural Code (JavaScript, Python, C++)
|
|
|
### How did we create the private dataset? |
|
We recorded many internal brainstorming sessions where we just talked about random things.
|
We also invited many experts from different fields: |
|
- Mathematicians |
|
- Developers |
|
- Bio-Engineers |
|
- Authors |
|
- Psychologists |
|
- and others... |
|
|
|
We discussed different topics with them, recorded the sessions, and then transcribed the audio to create the datasets (a hypothetical sketch of the transcription step follows).
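For illustration only, here is a hypothetical sketch of the transcription step using openai-whisper; the actual tooling we used is not disclosed, and the file names are made up.

```python
# Hypothetical transcription step; tooling and file names are illustrative.
import whisper

model = whisper.load_model("base")  # small model, for illustration
result = model.transcribe("brainstorm_session_01.wav")  # hypothetical recording

# Each transcript becomes one raw text record in the dataset.
with open("dataset.txt", "a", encoding="utf-8") as f:
    f.write(result["text"].strip() + "\n")
```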
|
|
|
--- |
|
|
|
### Please don't rely on the config.json in the files; it isn't accurate. You can run:
|
```python |
|
from transformers import AutoModelForCausalLM as amclm

model = amclm.from_pretrained("deepnight-research/saily_220b",
                              device_map="auto")

# Inspect the model's configuration
print(model.config)
|
``` |
|
to check out the model's configuration. |
|
|
|
--- |
|
|
|
|
|
### Try it: |
|
|
|
You definitely need GPUs here (that goes without saying).
|
* We have tried it on **4 x A100 80GB** and **2 x A100 80GB**. |
|
* You will have to load the model in **4-bit** to fit on **2 x A100 (80GB)**; a sketch follows the snippet below.
|
|
|
```python |
|
from transformers import AutoModelForCausalLM as amclm
from transformers import AutoTokenizer

model_name = "deepnight-research/saily_220b"
model = amclm.from_pretrained(model_name, device_map="auto")

# To load in 8-bit, make sure you have bitsandbytes installed.
# model = amclm.from_pretrained(model_name,
#                               device_map="auto",
#                               load_in_8bit=True
#                               )

# Float16
# import torch
# model = amclm.from_pretrained(model_name,
#                               device_map="auto",
#                               torch_dtype=torch.float16
#                               )

tokenizer = AutoTokenizer.from_pretrained(model_name)

input_ids = tokenizer.encode("[INST]\nWrite a poem about cats\n[/INST]\n\n",
                             return_tensors="pt").to(model.device)

# do_sample=True is required for temperature/top_p/top_k to take effect.
output = model.generate(input_ids, max_length=128,
                        do_sample=True,
                        temperature=0.7,
                        repetition_penalty=1.1,
                        top_p=0.7, top_k=50
                        )

output_text = tokenizer.decode(output[0], skip_special_tokens=True)
print(output_text)
|
``` |
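The commented variants above cover 8-bit and float16. To fit the model on **2 x A100 (80GB)** you will need 4-bit; here is a sketch using transformers' `BitsAndBytesConfig` (requires `bitsandbytes`; the settings shown are common defaults, not a tested recipe):

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# Store weights in 4-bit, compute in fp16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

model = AutoModelForCausalLM.from_pretrained(
    "deepnight-research/saily_220b",
    device_map="auto",
    quantization_config=bnb_config,
)
```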
|
|
|
We recommend following the **Alpaca Prompt Format**, and if you're trying it out in Text-Generation-WebUI, please use **INSTRUCT** or **CHAT-INSTRUCT** mode.
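For reference, a minimal helper that wraps a request in the standard Alpaca template (the template text is the published Alpaca format; the generation snippet above uses `[INST]` tags instead):

```python
# Standard Alpaca instruction template.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n### Response:\n"
)

prompt = ALPACA_TEMPLATE.format(instruction="Write a poem about cats")
# Pass `prompt` to tokenizer.encode(...) as in the snippet above.
```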
|
|
|
|
|
--- |
|
|
|
## Limitations and Bias |
|
As with all language models, Saily_220B may generate incorrect or biased content. It's important to keep this in mind when using the model. |
|
|
|
--- |
|
|
|
## Wanna Talk? |
|
Reach out to us at [[email protected]](mailto:[email protected]) or [[email protected]](mailto:[email protected]) |