How to Use
from unsloth import FastLanguageModel
model, tokenizer = FastLanguageModel.from_pretrained(
model_name = "Chimmyy/Llama3.1-8B-Finance",
max_seq_length = 1024,
dtype = None,
load_in_4bit = True,
)
FastLanguageModel.for_inference(model)
inputs = tokenizer(
[
prompt.format(
"What are the advantages of investing in bonds?", # instruction
"", # input
"", # output - leave empty for model
)
], return_tensors = "pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens = 64, use_cache = True) # Change max_new_tokens as needed
result = tokenizer.batch_decode(outputs)
print(result)
Uploaded model
- Developed by: Chimmyy
- License: apache-2.0
- Finetuned from model : unsloth/Meta-Llama-3.1-8B-bnb-4bit
This llama model was trained 2x faster with Unsloth and Huggingface's TRL library.
Model tree for Chimmyy/Llama3.1-8B-Finance
Base model
meta-llama/Llama-3.1-8B
Quantized
unsloth/Meta-Llama-3.1-8B-bnb-4bit