help

#1
by Abhaykoul - opened

How do I use it from transformers?

It should be exactly the same as Mixtral.

Can you please give Transformers code to run it?

from transformers import AutoModelForCausalLM, AutoTokenizer

device = "cuda"  # the device to load the model onto

model = AutoModelForCausalLM.from_pretrained("eastwind/tinymix-8x1b-chat")
tokenizer = AutoTokenizer.from_pretrained("eastwind/tinymix-8x1b-chat")
model.to(device)

prompt = "My favourite condiment is"

# Tokenize the prompt and move the input tensors to the same device as the model
model_inputs = tokenizer([prompt], return_tensors="pt").to(device)

# Sample up to 100 new tokens and decode the completion
generated_ids = model.generate(**model_inputs, max_new_tokens=100, do_sample=True)
print(tokenizer.batch_decode(generated_ids)[0])

taken from https://huggingface.co/docs/transformers/model_doc/mixtral

You also might want to format the prompt into ChatML, as I wrote in the README.
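
For example, here is a minimal sketch of ChatML formatting (the helper name and system message are illustrative; check the README for the exact template this model expects):

def to_chatml(user_message, system_message="You are a helpful assistant."):
    # Wrap each message in ChatML turn markers; the assistant header is left
    # open so the model completes the assistant turn.
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = to_chatml("My favourite condiment is")
model_inputs = tokenizer([prompt], return_tensors="pt").to(device)
generated_ids = model.generate(**model_inputs, max_new_tokens=100, do_sample=True)
print(tokenizer.batch_decode(generated_ids)[0])

If the tokenizer ships a chat template, tokenizer.apply_chat_template can build this string for you instead.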

srinivasbilla changed discussion status to closed
