---
license: apache-2.0
---

Using LoRA to finetune the [bigscience/bloom-1b7](https://huggingface.co/bigscience/bloom-1b7) model with [oasst1](https://huggingface.co/datasets/OpenAssistant/oasst1) data.

Sample inference code:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("Zayt/bloom-1b7-lora-merged-oasst")
model = AutoModelForCausalLM.from_pretrained(
    "Zayt/bloom-1b7-lora-merged-oasst",
    device_map="auto",
    torch_dtype=torch.float16,
)

prompt_format = "### Input:\n{human}\n\n### Response:\n"
text = prompt_format.format(**{"human": "what is the weather today?"})

inputs = tokenizer(text, return_tensors="pt").to(model.device)
input_length = inputs.input_ids.shape[1]

with torch.no_grad():
    outputs = model.generate(
        **inputs,
        max_new_tokens=400,
        do_sample=True,
        temperature=0.5,
        top_k=50,
        return_dict_in_generate=True,
        no_repeat_ngram_size=5,
        pad_token_id=tokenizer.pad_token_id,
        bos_token_id=tokenizer.bos_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

# Decode only the newly generated tokens, skipping the prompt.
tokens = outputs.sequences[0, input_length:]
output_str = tokenizer.decode(tokens)
print(output_str)
```
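Since the checkpoint is a merged LoRA finetune, loading it with plain `AutoModelForCausalLM` as above works without `peft`. For context, below is a minimal sketch of what such a LoRA finetuning setup looks like with the `peft` library; the rank, alpha, dropout, and save path are illustrative assumptions, not the values actually used to train this model.

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, TaskType, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "bigscience/bloom-1b7",
    torch_dtype=torch.float16,
    device_map="auto",
)

lora_config = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16,                                # assumed LoRA rank
    lora_alpha=32,                       # assumed scaling factor
    lora_dropout=0.05,                   # assumed dropout
    target_modules=["query_key_value"],  # BLOOM's fused attention projection
)

model = get_peft_model(base, lora_config)
model.print_trainable_parameters()

# ... train on OpenAssistant/oasst1 conversations formatted with the
# "### Input: / ### Response:" prompt template shown above ...

# Fold the LoRA weights back into the base model so the result loads
# with plain AutoModelForCausalLM, as in the inference snippet above.
merged = model.merge_and_unload()
merged.save_pretrained("bloom-1b7-lora-merged-oasst")  # hypothetical path
```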