🍷 Llama-3.2-Nemotron-3B-Instruct

This is a finetune of meta-llama/Llama-3.2-3B-Instruct (specifically, unsloth/Llama-3.2-3B-Instruct-bnb-4bit).

It was trained on the nvidia/HelpSteer2 dataset, similar to nvidia/Llama-3.1-Nemotron-70B-Instruct-HF, using Unsloth.

💻 Usage

!pip install -qU transformers accelerate

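from transformers import AutoTokenizer
import transformers
import torch

# Hub ID of the fine-tuned model and a single-turn chat to send to it
model = "itsnebulalol/Llama-3.2-Nemotron-3B-Instruct"
messages = [{"role": "user", "content": "How many r in strawberry?"}]

# Render the chat into the model's prompt format, ending with the assistant header
tokenizer = AutoTokenizer.from_pretrained(model)
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Load the model in half precision; device_map="auto" places it on the available device(s)
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Sample up to 256 new tokens; the returned text includes the prompt
outputs = pipeline(prompt, max_new_tokens=256, do_sample=True, temperature=0.7, top_k=50, top_p=0.95)
print(outputs[0]["generated_text"])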
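
By default, the text-generation pipeline returns the prompt followed by the completion, so the printed text starts with the rendered chat template. If only the model's reply is wanted, one option is to pass `return_full_text=False` to the pipeline call. Another is to strip the prompt afterwards, as in the minimal sketch below. The helper name `strip_prompt` and the placeholder strings are illustrative assumptions, not part of the original card or of the transformers API.

```python
def strip_prompt(generated_text: str, prompt: str) -> str:
    """Return only the newly generated part of a text-generation output.

    Hypothetical helper for illustration; not part of transformers.
    """
    # The pipeline's default output is prompt + completion, so drop the prefix.
    if generated_text.startswith(prompt):
        return generated_text[len(prompt):].lstrip()
    # If the prompt is not found verbatim, return the text unchanged (trimmed).
    return generated_text.strip()


# Placeholder strings only; these are not real model output.
example_prompt = "PROMPT TEXT"
example_output = example_prompt + "\n\nPLACEHOLDER REPLY"
print(strip_prompt(example_output, example_prompt))
```

With the usage code above, this would be called as `strip_prompt(outputs[0]["generated_text"], prompt)`.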

This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.

Model size: 3.21B params, stored as Safetensors (tensor type BF16).