---
language:
- multilingual
license: apache-2.0
---

# Model Card for Sindibad-7B

# Table of Contents

0. [TL;DR](#tldr)
1. [Model Details](#model-details)
2. [Usage](#usage)
3. [Training Details](#training-details)
4. [Evaluation](#evaluation)

# TL;DR

# Model Details

## Model Description

- **Model type:** Language model
- **Language(s) (NLP):** English
- **License:** Apache 2.0

# Usage

The example scripts below show how to use the model in `transformers` (make sure you have the latest release of `transformers`, or one built from source):

## Using the PyTorch model

### Running the model on a CPU
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tiiuae/sindibad-7b")
model = AutoModelForCausalLM.from_pretrained("tiiuae/sindibad-7b")

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```
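The `generate` call above uses greedy decoding and a short default output length. The sketch below passes common generation arguments; these are standard `transformers` parameters rather than anything specific to this model, and the values are illustrative only:

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tiiuae/sindibad-7b")
model = AutoModelForCausalLM.from_pretrained("tiiuae/sindibad-7b")

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids

# max_new_tokens bounds the length of the continuation;
# do_sample=True with top_k switches from greedy decoding to top-k sampling.
outputs = model.generate(input_ids, max_new_tokens=30, do_sample=True, top_k=10)
print(tokenizer.decode(outputs[0]))
```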
### Running the model on a GPU
```python
# pip install accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tiiuae/sindibad-7b")
model = AutoModelForCausalLM.from_pretrained("tiiuae/sindibad-7b", device_map="auto")

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```
### Running the model on a GPU using different precisions

#### FP16
```python
# pip install accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tiiuae/sindibad-7b")
model = AutoModelForCausalLM.from_pretrained("tiiuae/sindibad-7b", device_map="auto", torch_dtype=torch.float16)

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```
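#### BF16

`bfloat16` is an alternative half-precision format available on recent GPUs (e.g. A100). The following is a minimal sketch, assuming the checkpoint loads cleanly in `bfloat16`; only the `torch_dtype` argument changes relative to the FP16 script above:

```python
# pip install accelerate
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tiiuae/sindibad-7b")
# Identical to the FP16 example, except for torch_dtype=torch.bfloat16.
model = AutoModelForCausalLM.from_pretrained("tiiuae/sindibad-7b", device_map="auto", torch_dtype=torch.bfloat16)

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```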
#### INT8
```python
# pip install bitsandbytes accelerate
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("tiiuae/sindibad-7b")
model = AutoModelForCausalLM.from_pretrained("tiiuae/sindibad-7b", device_map="auto", load_in_8bit=True)

input_text = "Question: How many hours in one day? Answer: "
input_ids = tokenizer(input_text, return_tensors="pt").input_ids.to("cuda")

outputs = model.generate(input_ids)
print(tokenizer.decode(outputs[0]))
```
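For quick experiments, the high-level `pipeline` API wraps tokenization and generation in a single call. This is a minimal sketch using standard `transformers` pipeline arguments; nothing here is specific to this checkpoint:

```python
# pip install accelerate
from transformers import pipeline

# device_map="auto" is forwarded to the underlying model loading,
# so the pipeline is placed on available GPUs automatically.
generator = pipeline("text-generation", model="tiiuae/sindibad-7b", device_map="auto")

result = generator("Question: How many hours in one day? Answer: ", max_new_tokens=20)
print(result[0]["generated_text"])
```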
# Training Details

## Training Data

## Training Procedure

# Evaluation

## Results