Usage

Support for this model will be added in the upcoming transformers release. In the meantime, please install the library from source:

pip install git+https://github.com/huggingface/transformers

We can now run inference on this model:

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

# Load the tokenizer and model
model_path = "YaoLuzjut/Llama-3.1-6.3B-It-Alpaca"
tokenizer = AutoTokenizer.from_pretrained(model_path)

device = 'cuda'
dtype = torch.bfloat16
model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=dtype, device_map=device)

# Prepare the input text
prompt = 'Complete the paragraph: our solar system is'
inputs = tokenizer.encode(prompt, return_tensors='pt').to(model.device)

# Generate the output
outputs = model.generate(inputs, max_length=20)

# Decode and print the output
output_text = tokenizer.decode(outputs[0])
print(output_text)
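
The example above loads the weights in bfloat16. As a rough guide to how much device memory that needs, the sketch below estimates the size of the weights alone from the 6.29B parameter count reported for this model. The helper name `weight_memory_gib` is hypothetical, and the estimate ignores activations, the KV cache, and framework overhead, so actual usage will be higher.

```python
# Rough weight-memory estimate (assumption: weights only, no activations,
# KV cache, or framework overhead are counted).
PARAMS = 6.29e9  # parameter count reported for this model

# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2}


def weight_memory_gib(num_params: float, dtype: str) -> float:
    """Return the approximate size of the weights in GiB (hypothetical helper)."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: ~{weight_memory_gib(PARAMS, dtype):.1f} GiB")
```

Under these assumptions, loading in bfloat16 needs about half the weight memory of float32, which is why the example passes torch_dtype=torch.bfloat16.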

Evaluation Results

Zero-shot performance. Evaluated using select datasets from the LM Evaluation Harness with additions:

Benchmark     Score
PIQA          0.7383±0.0103
HellaSwag     0.5323±0.0050
OpenbookQA    0.3080±0.0207
ARC-e         0.7260±0.0092
ARC-c         0.4684±0.0146
MMLU          0.6567±0.0038
CMMLU         0.5515±0.0045
WinoGrande    0.6646±0.0133

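For a single summary figure, the reported scores can be averaged. The sketch below computes an unweighted mean over the eight benchmarks. Equal weighting is an assumption made here for illustration and is not stated in the source, and the value after ± is simply carried along without interpretation.

```python
# Scores copied from the results above as (score, value after the ± sign).
results = {
    "PIQA": (0.7383, 0.0103),
    "HellaSwag": (0.5323, 0.0050),
    "OpenbookQA": (0.3080, 0.0207),
    "ARC-e": (0.7260, 0.0092),
    "ARC-c": (0.4684, 0.0146),
    "MMLU": (0.6567, 0.0038),
    "CMMLU": (0.5515, 0.0045),
    "WinoGrande": (0.6646, 0.0133),
}

# Unweighted mean: every benchmark counts equally (an assumption).
mean_score = sum(score for score, _ in results.values()) / len(results)
print(f"Unweighted mean over {len(results)} benchmarks: {mean_score:.4f}")
```
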
@article{lu2024reassessing,
  title={Reassessing Layer Pruning in LLMs: New Insights and Methods},
  author={Lu, Yao and Cheng, Hao and Fang, Yujie and Wang, Zeyu and Wei, Jiaheng and Xu, Dongwei and Xuan, Qi and Yang, Xiaoniu and Zhu, Zhaowei},
  journal={arXiv preprint arXiv:2411.15558},
  year={2024}
}

Model size: 6.29B params
Tensor type: F32